Microsoft cares too much


In the 1980s, nobody would mistake
Microsoft for a caring, cuddly company.
Instead, Redmond was the place
where other software companies were
quietly strangled behind doors. Bill
Gates plotted the demise of competitors,
and Steve Ballmer was known to
use his loud voice to get his point across.

Microsoft was ruthless, and seemingly
didn’t primarily focus on the
needs of its customers when imagining 
software features. Rather, Microsoft 
was focused on killing the competition, 
locking its customers into proprietary 
file formats, playing it safe, being all 
things to all people, and earning large 
stacks of money. 

On the other side of the coin, Apple
in the 1980s came across as a happy,
touchy-feely place where the user experience
and vision were paramount. The
feeling was that Steve Jobs and company
really wanted to make something awesome,
take risks, and be willing to be elitist
and controversial. It seemed that making
money wasn’t the point, although
they proved to be good at that as well.

Fast-forward. Today, Microsoft is the
market leader in dozens of categories. It
has millions of users and developers
worldwide, all deeply tied to its decisions.
Apple, on the other hand, seems
incredibly preoccupied with its bottom
line at a level that only Microsoft and
Oracle have previously seen.

And yet, while both companies 
recently had the same visions—desktop 
and mobile operating systems merging 
into one—only Apple has had the guts to 
follow through on that vision. Microsoft 
tried to push its new tiles onto the Windows 8
crowd, only to quickly give in to
the outcry that came from its users and
developers over the removal of the desktop
Start button, for example.

Meanwhile, Apple blithely ignores
its users’ demands, only implementing
them after great struggle and strife.
Take, for example, the fact that iOS 7 is
finally getting some real multi-tasking
capabilities. With each new release of 
Windows 8, Microsoft seems to be 
moving closer and closer to its older 
systems, refusing to remove features 
that made people feel comfortable. 
They’re listening too much to their 
users; it’s actually harmful. 

Apple doesn’t seem to care what you
or developers want. It thinks it knows
best, and it’s going to make all the decisions
for you. It’s a shift we’re surprised
by, but it doesn’t make either side right.
Microsoft could use some backbone to
stand up to its users and finally deprecate
some old tech. Apple, on the other
hand, needs to stop taking away from its
users, and start giving them the features
they deserve and want. Somewhere in
the middle is a happy medium.


New database tech: Immature but valuable 


The world of relational database
engines is mature and solid. The 
landscape of new databases is anything 
but. 

When you venture outside the traditional
RDBMS (and traditional
RDBMS vendors like IBM, Microsoft
and Oracle), choice abounds right now:
SQL, NoSQL, MySQL, SQLite, graph
databases, and key-value stores. Opinions
vary about which database is the
best. Developers can go the open-source
route or stick with proprietary
solutions. Is a NoSQL or relational
database better for your application?
The answer is no longer clear. This is
the time for experimenting to see what
works, both from the developer’s standpoint
and the provider’s standpoint.

Venture away from the relational
database and enter the Wild Wild West.
Companies and open-source projects
are experimenting like mad, pushing
the boundaries of what can be done.
Everybody is scrambling to get their
ideas out, throwing them against the
wall to see what will stick.

That’s exactly how it should be right 
now. These new technologies—and 
their applications—aren’t anywhere 
near ready for the establishment of 
standards. We don’t want a horse race; 
we want a mad dash. 

Developers don’t have to choose one 
or the other anymore; instead they can 
use both. They can use open-source 
NoSQL databases like MongoDB or 
IBM’s DB2 relational database, since 
they’re giving you that option now.

As you’ll read in this issue on p. 26,
IBM and 10gen are trying to bridge the
NoSQL and relational worlds by collaborating
on an open standard that will let
developers access data held in both
MongoDB and DB2. Meanwhile, FatCloud
is bridging the Microsoft SQL
Server world with the NoSQL world,
giving .NET developers choice as well.

But the story is not really about these 
companies collaborating; it’s about these 
companies seeing what makes sense and 
trying to take things in different directions.
This is just the type of innovation
one would expect in a burgeoning area, 
and we welcome this kind of work. 

You should welcome this mad scramble
as well. If you can get value from a
newfangled database, don’t let the lack 
of industry maturity hold you back. 
Adopt early; try multiple solutions; be 
willing to experiment with software that 
you might want to replace in a few years. 

Eventually the market will shake
out, and perhaps some of these efforts
will wither on the vine. Until it does,
find what works for you now and implement
it, and then watch with amazement
(or is it amusement?) as fads
and buzzwords go tumbling by.
SDTimes on the web


Get Linked H. 


The talk about Windows 8.1 is on the restoration of the 
Start button, but not in the way users were accustomed to. Do you 
think Microsoft is taking a step backward with 8.1? 


The problem isn't just the Start button, but the insistence
that running one poor Metro app at a time is better than running a
desktop full of high-quality Windows apps.

I don't want apps in my start menu, but that was not the 
question. Is it a step backward? No, it is a small step 
forward that doesn't make up for the three large steps back. 




I think Microsoft is out of touch with its user base.
Trying to make an OS that is good for both desktops and tablets is 
a bit like building an SUV that also performs like a sports car. It 
won't be cheap or easy to develop, and in the end will likely fall 
short of expectations. 

What are they thinking? They're thinking they 
need an OS that works for tablets, so keep it there! Windows 8 is so 
cumbersome on a laptop, and honestly, the Start button is the least 
of my concerns. 

I actually like Metro. Try this: Add your favorite apps 
to the Metro home page. Anytime you want to work with something
on Windows, and it isn't convenient to do on Metro's desktop
or your Metro shortcuts, simply use search. 

Free your mind and just look for what you need. I found it to be 
faster and more efficient than any OS. I don't think you will be disappointed
with Windows 8 anymore. You may even want to get rid
of the Start button. 


Best practices for agile 
and DevOps teams 

If your team is doing agile or DevOps, work doesn't 
have to be so difficult. CA Technologies' Shridhar 
Mittal has five steps to easing things up. 


1. Embrace the cloud

2. Shift quality left

3. Automation is your friend

4. Break down your team's walls

5. Embrace transparency for agile development



Read more at sdt.bz/60784. 



What do YOU think the effect of
buzzwords/marketing has on our field?


I like buzzwords. They make it easy
for me to relate what I'm doing to others around me, and for
them to think they understand. I work for an advertising 
agency though, so they're highly susceptible to marketing. At 
the end of the day, I do wish that people in general would take 
a little more interest in the work that's being done to enable 
shiny new software, but most don't want to know. And their 
eyes glaze over when you try to tell them. 


I agree that it helps those around us think they 
know what we are doing. I find that if people think they know 
what's going on, they tend to leave me alone to do what I 
enjoy. Many times, however, I think of buzzwords as repainting
a car. It looks different but doesn't drive any differently.



► Be ready to walk out of your cloud in 
30 seconds flat... 

Recent NSA revelations don't mean a lot to your cloud operation,
if you have one. But sometimes things like this cause bosses
to become a bit hasty: "Be ready to spread your apps around
the globe. These NSA revelations probably won't do a whole lot
of damage to American hosting; there are so many
well-heeled customers here in the U.S. that hosting
outside its borders would quickly yield Web slowdowns
that would hurt business." A more detailed
explanation can be found at sdt.bz/60773.


► Will open source fight for 
the Big Apple? 

New York has awarded an exclusive contract to Nextdoor 
for building a citywide social network. It sounds like a 
very ambitious project: "Residents will be connected 
based on their verified addresses and see themselves 
'fully integrated with New York government departments, 
to be used by police, fire, utility and other agencies.'" But 
can the Five Boroughs put out their own social network to 
compete? Read more at sdt.bz/60788. 


My biggest issue with buzzwords is defining
them. Cloud is a great example: Is it on the Cloud? For the 
Cloud? I sometimes think having new words for the sake of it 
just makes some people feel "in the know." 


I see the value of buzzwords creating a bit of 
structure and enabling a starting point for conversation, but 
they are far overused and oversimplified. "The cloud" is a great
example of complete ambiguity, and "Big Data" is frantically trying
to catch up. But I do think that "fetch" should catch on, and by 
the end of the movie you almost find yourself saying it. 


Join the SD Times group at 
Visual Studio 2013,
TFS 2013 announced 

Microsoft continues its push to scale 
agile up to business decision-makers 


BY DAVID RUBINSTEIN 

The latest versions of Microsoft’s development
environment Visual Studio
2013, and its application life-cycle management
tool suite Team Foundation
Server, were announced at Microsoft’s
TechEd conference in June.

The company also announced the 
acquisition of TFS release-management 
software InRelease from a company 
called InCycle Software, as well as a Git 
option that will be available on-premise 
within Team Foundation Server in 2013. 

Brian Keller, Microsoft’s principal 
technical evangelist for Visual Studio, 
said that a preview of the software was 
expected to be made available starting 
on June 26 with a Go-Live license so 
users can start working with the software
right away.

With the new releases, Microsoft
continues to discuss the ability to scale
agile beyond development teams to
include business decision-makers as
well as operations teams that have to
deploy and manage the applications in
the wild. Microsoft calls this agile portfolio
management, Keller said.

Other highlights of the release
include an ambient indicator within the
code editor to show developers the status
of their work vis-a-vis tests. “Within
the code editor, we have a heads-up display
for the coder,” said Keller.

“When you’re in a method body, for
example, we can get an ambient indication
in the background that shows you
things like, ‘Hey, for this particular
method, are my unit tests passing?’
And, if the last time you ran your unit
tests they failed and they happen to
exercise this particular method, you can
get a visual indicator right above your
code that helps you understand that. I
can look at that same method, and
there’s another indicator that tells me
who is the last person to make a change,
[and] what are the last five changes.
And so when I mouse over that, it tells
me Jason was the person to make a
change, and the reason he made that
change was to fix this bug, and all this
information comes straight out of Team
Foundation Server and lights up for the
developer.”

Then there’s cloud-based load testing.
This has been built into Team
Foundation Server since 2005, but 
under the old model, developers had to 
provision the hardware themselves to 
generate the load. Keller said Microsoft 
thought this was a great scenario to take 


TechEd notes from Brian Harry 

At this year's TechEd, Brian Harry (product unit manager 
for Team Foundation Server) announced Visual Studio 2013 
and Team Foundation Server 2013. He later blogged about 
what these mean for developers... 


"At TechEd, we enabled some of those [ALM] features on
Team Foundation Service to try out immediately, and I 
announced that a preview of VS 2013 and TFS 2013 will be 
available at the Build conference. 

It's an exciting time now that we can start talking
more openly about what's coming in our next
major release. As usual, there's so much that I will 
only be able to just skim the surface with this post. 
Stay tuned for many more posts on my blog, the 
ALM blog, the Visual Studio blog and others as we 
reveal more detailed information about all of the 
new capabilities. Also, check out Soma's blog for 
his perspective on TechEd's announcements. Of 
course I'll post again with download links as soon 
as they are available." 

A (very) detailed breakdown is available at 
tinyurl.com/brianharryteched. 



Lightweight code commenting has been added to Team Foundation Service. 
Brian Harry demos upcoming
Visual Studio 2013 features. 


to the cloud. 

The Git integration option has been 
available through Team Foundation 
Service since January, but now it will be 
available in Team Foundation Server 
2013. “On the server-side, with Team
Foundation Server, I can create a project
and either choose from my traditional
Team Foundation Server version control,
which is the centralized version
control system we’ve always had, or I
can choose a Git repository,” said Keller.

“The nice thing about our approach 
there is that you don’t sacrifice the benefits
of integrated ALM by choosing
Git, because the way that we’ve built it, 
it still can be linked and you get the 
traceability over with your work items, 
so as I’m checking in code, I can see 
that this piece of code was to fix this 
bug, and that bug fix is available in this 
build, and so on.” 

Keller also said Microsoft has 
shipped a set of tools for Visual Studio 
that allows developers to connect to Git 
repositories straight from the Visual Studio
client. That is already available with
Visual Studio update 2, which requires 
an additional package. That capability 
will be built into Visual Studio 2013. 

Keller emphasized that this is not a
Microsoft implementation of Git; the
company has worked with the open-source
community to ensure that it’s
the same set of interfaces that a developer
would see if he or she were working
with Bitbucket, Git or GitHub.


Answering the call for 
stronger software security 

Testing tools should be built within a process 


BY ALEX HANDY 

The security landscape has been evolving
along with technology. At the same
time, as businesses embraced the cloud
and new software-development technologies,
exploit writers and black-hat
hackers changed their tactics.

Tim Rains, director of product management
at Microsoft for Trustworthy
Computing, spoke at the recent Security
Development Conference in San Francisco,
saying that Microsoft’s own security
practices were molded out of the need
to quickly and accountably track vulnerabilities
from disclosure to patching. He
said that the processes Microsoft adapted
over years of internal security work are
what make up SDL now.

“At Microsoft, the goals are pretty 
simple. One: reduce the number of 
these vulnerabilities, and two: reduce 
the severity of the vulnerabilities left in 
products after they ship,” said Rains. 
“As long as humans make software,
there’ll be mistakes. For those vulnerabilities
left over, let’s make them really,
really hard to exploit. We’ve been trying
to share more and more of this over
time. SDL is a methodology for creating
software, and a tool set to support
that methodology.” 

Rains said that, when it comes to 
security, building a process is key. But 
just having a process is not enough, he 
added. What’s more important is having 
a process around which tools can be 
built. Thus, he said, Microsoft has been 
providing some tooling around SDL, 
tools that don’t need to be intrusive or 
overly complex, however. 

“In the newest version of Visual Studio,
they have an SDL switch. In previous
versions of the compiler, you’d have
to know which switches to check.
They’ve made it really easy now, where 
it’s a single checkbox. You check it, and 
you’ll get all the goodness that goes into 
these safety mitigations,” said Rains. 


Other tools that have become popular 
for securing software applications are 
code vulnerability scanners and in-IDE 
standards enforcers. Companies like 
Coverity, HP, Klocwork and Veracode all 
offer tools that can be used to spot vulnerabilities
in software, but they all have
a classic Achilles’ heel: the false positive. 

False positives mean bringing 
humans into the process to identify 
which vulnerabilities that are detected 
aren’t actually critical bugs, but bringing
people into the process inherently
slows it down. That’s why McCabe IQ 
takes a step up the ladder to offer a 
more holistic form of code scanning. 

David Belhumeur, CEO of McCabe 
Software, said that McCabe IQ can be 
used to determine which detected vulnerabilities
are actually important.
“You’ve identified these hundreds of 
thousands of vulnerabilities, now what 
do we do?” he said. “From a security 
perspective, we analyze the impact and 
the context of those vulnerabilities. It’s 
a way to prioritize and help you with 
that end of things. You’re saving money 
and saving time by focusing on the most 
critical vulnerabilities. 

“In testing, we analyze security testing.
Code coverage is what people know
us for, but we do the analysis of the vulnerable
code. An organization that’s
sophisticated is going to do some test 
coverage on that. We go down to the 
line coverage, but also we are known for 
complexity analysis. We do risk analysis. 
You want to make sure those vulnerable 
areas of the code can be analyzed.” 

Belhumeur said that with some
security code-scanning tools, false-positive
rates can be “up to 50%. That’s just
the nature of the beast. Where we help
with that is that if you can target that
manual review and focus on those critical
areas, and see these vulnerabilities
in context,” you can better respond to a
crisis, he said.
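
To make the idea of targeted manual review concrete, here is a minimal Python sketch of ranking scanner findings so that reviewers see the riskiest ones first. It is an illustration only, not how McCabe IQ or any scanner named above works: the `Finding` fields, the scoring weights and the rule names are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One scanner result (all fields are hypothetical for this sketch)."""
    rule: str
    severity: int    # 1 (low) to 5 (critical), as reported by a scanner
    complexity: int  # cyclomatic complexity of the enclosing function
    covered: bool    # True if existing tests exercise the flagged code


def risk_score(finding: Finding) -> int:
    """Assumed weighting: severe findings in complex code rank higher,
    and untested code doubles the score."""
    score = finding.severity * finding.complexity
    if not finding.covered:
        score *= 2
    return score


def review_queue(findings: list[Finding], limit: int) -> list[Finding]:
    """Return the `limit` highest-risk findings for manual review."""
    return sorted(findings, key=risk_score, reverse=True)[:limit]


findings = [
    Finding("sql-injection", severity=5, complexity=12, covered=False),
    Finding("unused-variable", severity=1, complexity=3, covered=True),
    Finding("buffer-overflow", severity=4, complexity=20, covered=True),
]
queue = review_queue(findings, limit=2)
```

With these sample values, the untested SQL injection finding and the buffer overflow in complex code reach the queue, while the low-severity finding is deferred.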
A dream for all drones

DreamHammer's quest for an OS for unmanned vehicles 


BY ALEX HANDY 

The world has big plans for unmanned 
aerial drones; developers just have to 
build out all the complex software that 
is needed to enable those plans. Until 
now, every drone has been its own software
island, with purpose-built drivers
and routines tailored specifically to 
every device and every company. But 
DreamHammer hopes to change this 
by building a more generic, general-purpose
drone operating system.

DreamHammer CEO Nelson Paez 
said that drone technology is maturing 
in a fashion similar to that of the personal
computer. When those devices
first arrived, they were also typically 
quite purpose-driven, he said. Wang 
computers, for example, were initially 
calculators, then word processors. That 
was the task they were charged with, 
and that was all they could do. 

Over time, more general-purpose
operating systems arrived, and the personal
computer was able to do a lot
more than just process words and count
digits. So too, said Paez, will it be with
drones. While current drone technologies
are largely focused on military
applications, he sees a future fast
approaching in which drones are an
everyday part of business.

DreamHammer's Ballista operating system allows drones from multiple manufacturers to
behave similarly through a common, unified interface.


While that might not happen
overnight, Paez said that drone vendors
are already amenable to the idea of a
generic drone operating system. When
using drones in a business situation, he
said, “They can be from three different
manufacturers, be three different vehicles,
have three different ways of moving
a gimbal...” But by using
DreamHammer’s Ballista operating
system, those drones can be operated
with the same software, despite being
totally different at their cores.





The Tacocopter Mission 

In San Francisco's Mission District, Mexican cuisine is a way of life. This is 
where the famous “Mission Style" of burrito comes from, which you can now 
find at any Chipotle. Yet, no matter how many tortillas and enchiladas one 
scarfs down, hunger is only just a few hours away. 

Today, San Francisco residents must walk to their cocina of choice 
when hunger rears its grumbly head. But in the future, drone-based 
taco delivery could slake appetites from Pacific to Bay. 

Tacocopter is the brainchild of Star Simpson, an MIT graduate 
currently working at educational toy company Canidu. Tacocopter isn't actually a company
yet, however. Instead, it's a proposed business model awaiting regulation changes from
the FAA. It's not technically legal to unleash an autonomous taco-delivery quadcopter into
civilian airspace for commercial purposes, you see. But with any luck, regulations will
ease, and Simpson and her team will build up this company. They've already demonstrated
short-range taco-delivery mechanisms at robotics festivals.

When regulations change, the real work begins. The team has already chosen a
source for tacos: El Tonayense, a well-respected taqueria. Then come the calculations:
average weight of a beef taco versus average weight of an al pastor taco. And, at what
point does it become feasible to upgrade drones to carry a burrito payload?


“We provide standardization for 
drones and robots by doing that,” said 
Paez. “The beautiful thing about it is 
[drone vendors] like that. They see it as 
the future for all of them. Customers
will buy more if there are more applications.”
And he’s hoping there will be a
lot more applications. 
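
The kind of standardization Paez describes can be pictured as a common interface with one adapter per manufacturer. The Python sketch below is a hedged illustration only; it is not Ballista's actual API, which the article does not show. The class names, command strings and units are hypothetical.

```python
import math
from abc import ABC, abstractmethod


class GimbalDriver(ABC):
    """Common interface the operator software talks to (hypothetical)."""

    @abstractmethod
    def point(self, pan_deg: float, tilt_deg: float) -> str:
        """Return the vendor-specific command for the requested angles."""


class VendorAGimbal(GimbalDriver):
    # Hypothetical vendor whose protocol takes degrees as text fields.
    def point(self, pan_deg: float, tilt_deg: float) -> str:
        return f"A:PAN={pan_deg:.1f};TILT={tilt_deg:.1f}"


class VendorBGimbal(GimbalDriver):
    # Hypothetical vendor whose protocol takes integer milliradians.
    def point(self, pan_deg: float, tilt_deg: float) -> str:
        pan = round(math.radians(pan_deg) * 1000)
        tilt = round(math.radians(tilt_deg) * 1000)
        return f"B {pan} {tilt}"


def point_all(fleet: list[GimbalDriver], pan_deg: float, tilt_deg: float) -> list[str]:
    """Issue the same logical command to every drone, whatever its vendor."""
    return [drone.point(pan_deg, tilt_deg) for drone in fleet]


commands = point_all([VendorAGimbal(), VendorBGimbal()], pan_deg=90, tilt_deg=0)
```

The calling code never branches on the manufacturer; supporting a new vehicle means adding one adapter class rather than changing every application.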

Paez said the FAA is working on
changing drone laws for commercial
uses in the United States. This would
enable shipping companies to use
drones, and would also enable the
San Francisco-based startup, Tacocopter,
to take flight. Tacocopter is a proposed
drone-based taco-delivery service.

In the future, Paez said he hopes to
offer a community edition of Ballista,
where the drone operating system
could be used to pilot hobbyists’
tinkering projects as well as the mainline
workhorse drones. He envisioned a
world where drones delivering shipments
for FedEx run the same operating
system as a drone used around the
house or in war. He said he wants Ballista
to run on all five grades of drones.

“We want to be able to provide, in 
the near future, for the small vehicles,” 
he said. “Group 5 is a large drone like 
the Global Hawk; group 1 is a handheld 
drone like a Tacocopter. We want to
provide the software for that community
for free.”
What MOOC’s moment
means for managers 


Can software development team leaders
hitch a ride on Massive Online Open Courses?


BY ALEXANDRA WEBER MORALES 

Do you need to ramp up parallel programming
expertise on your team? University
of California, Davis associate
professor John Owens will simultaneously
teach thousands of developers
across the world how to use the CUDA
environment to maximize GPUs for free
in Udacity’s “Introduction to Parallel
Programming” class. Or perhaps the
robotic car programming class by Sebastian
Thrun, cofounder of Udacity and
one of the inventors of Google’s self-driving
car, is more your company’s speed.

Ivy League providers and prestigious 
names like Thrun are fueling today’s
massive online open course momentum 
(for academics, anyway). Will it hold true 
for those seeking Microsoft Certified 
Solutions Expert or Cisco certification? 

“On the academic side, there’s
already a hierarchy,” said Bryant Nielson,
managing director at CapitalWave,
a New York-based financial training
company that is now launching corporate
MOOC-based continuing education.
“Anything delivered by Harvard is
very attractive, more so than Penn
State. In the corporate arena, it’s entirely
different. They don’t care as long as
it’s high quality.”

Since the term was coined in 2008 by
Canadian academics, MOOCs are bursting
out of the Ivory Tower. Coursera and
Udacity, startups that matriculated from
Stanford, are gaining registrants in the
tens of thousands. Noted professors
such as Gregor Kiczales of the University
of British Columbia, or Rob Rutenbar,
head of computer science for the University
of Illinois at Urbana-Champaign,
are available to engage developers as
they delve into new areas or fine-tune
design skills. Microsoft’s Virtual Academy
is considering MOOC-like offerings
to complement its Jump Start and Live
Q&A series, according to a recent blog
post by Matt Calder, publishing manager
for MVA.

Learning together 

What makes MOOCs more attractive 
than other forms of online education, or 
the plethora of college and conference 
lectures currently available on 
YouTube? The time-boxed nature of 
the courses is key, which means short 
lectures reveal information sequentially 
to a group of students who bond over 
their typically six-week-long journey via 
forums and peer reviews (in this 
respect, Khan Academy doesn’t qualify
as a MOOC because courses are on-demand).
Testing, certificates or badges
(such as Mozilla’s Open Badges) are 
motivators, too. Finally, the platforms 
themselves offer useful metrics that 
make students feel like they’re progressing
toward their learning goals, as
well as analytics that can feed back into 
better pedagogy. 

Benefitting from high-profile coverage
that comes on the heels of students’
tuition protests and other economic
woes, MOOCs have enjoyed some popularity.
“The three big MOOCs were
launched a year ago,” said Nielson. “I
was aware of Khan Academy for a couple
years, but not sure how it was going
to be impactful. We started pitching in
January, and just signed a very large corporate
MOOC contract yesterday.”

He cautioned, however, that corporate
MOOCs are a different beast: “A
MOOC for the commercial market is
only a framework. The idea of having 
10,000 or 100,000 people sign up is 
motivating the academics. In corporate, 
it’s not. It’s the video delivery, the 
engagement through testing and online 
discussions. It straddles instructor-led 
and e-learning, but it’s not a siloed 
event like e-learning. It’s going to be 
incredibly disruptive.” 

Marketing for extra credit 

In areas of high demand, now may be 
an excellent moment to straddle the 
corporate and academic worlds with a 
technology-focused MOOC—all while 
earning a marketing boost as extra credit.
That’s exactly the strategy SAP has
taken to encourage expertise in its SAP 
HANA in-memory database platform, 
launched two years ago. 

“Of course, we have classroom training
for developers and e-learning courses,
but for SAP HANA, we wanted to
spread the knowledge much broader and
reach audiences not normally reached,
such as university students and partners,”
said Bernd Welz, senior vice president,
and head of Solution and Knowledge
Packaging for SAP. “We found the
MOOC format interesting because of 
the scale and the flexible learning for 
professionals as well as students.” 

The six-week course, which began in 
May, requires approximately four hours 
a week to view the videos, do the exercises
and take the tests. So far, demand
has well exceeded normal e-learning 
signup rates. “We announced the 
course two weeks ago and have more 
than 7,910 registered,” said Welz. 

The course will use the OpenHPI
platform pioneered by the Hasso-Plattner-Institut
in Potsdam, Germany,
which launched last year. “The course
that HPI did on in-memory computing
had more than 17,000 students. They
pioneered the MOOC format in Germany,”
said Welz. “We have a couple of
topics in the pipeline if this is successful.”

A fraction of the cost 

Learning management system (LMS)
vendors are taking note of MOOCs’
pedagogical features and launching
their own versions for the corporate
world. Security is a corporate requirement
for many, while scalability may be
less of a concern, according to CapitalWave’s
Nielson.

“Most LMS vendors would want to
migrate to a MOOC framework, but it’s
not necessarily going to be massive and
it’s not going to be open,” he said. “It’s 
just like how now, at Penn State, they 
use [the LMS] Blackboard, but it’s only 
available for registered students.” 

But the savings in terms of corporate 
training could be massive indeed. No 
more travel costs, no running multiple 
instructor-led programs for groups of 
15 employees, no worrying about time
constraints, and no repetitive on-boarding
of new employees, said Nielson.
Most MOOCs are currently free, but
future business models may revolve
around headhunting or credentials.

In the meantime, while providing
high-quality videos and well-designed
courses that effectively use MOOC
platforms are key to attracting students
to a course and educating them, the
secret of MOOC’s success lies in connectivism,
the learning theory that’s
defining the digital age. The MOOC
can’t take the place of all training, but it
refocuses learners toward a technologically
facilitated exchange of knowledge
in fast-changing fields. Though learners
don’t often realize it until they try a
MOOC for the first time, peer review
and discussion are often more memorable
than the instruction by prestigious
names or institutions.

“The community is open, so learners 
can discuss among themselves,” said 
Welz. “We’ll have experts online too. 
We can leverage the community to 
make learning more intense and sticky.” 

When the student is ready, the
teacher appears.
Java EE 7 further embraces Web technologies

Oracle preps specification to handle cloud, HTML and JSON 


BY ALEX HANDY 

Java EE 7 was officially released in June 
with additional support for HTML5 
and Web technologies, giving developers
a more robust, business-focused
platform for their Java applications. 

The updated Java specification is the 
culmination of work on 14 active JSRs; 
chief among them was JSR 342, the 
umbrella specification for the platform. 
The main focus of Java EE 7 itself was 
to prepare the platform for the cloud, 
said Oracle. 

Linda DeMichiel, specification lead 
for the Java EE platform and an Oracle 
employee, said the major themes of 
Java EE 7 are focused on developer 
productivity. That means embracing 
popular Web technologies within the 
Java EE 7 stack. She highlighted the 
new support for JSON as an example. 

Said DeMichiel: “JSON is a key
technology for data transfer within
HTML5 apps. With Java EE 7, we add
new APIs to enable parsing and generation
of JSON.”

These new JSON capabilities go
hand in hand with the new WebSocket
support. The Java API for WebSocket
1.0, said DeMichiel, “enables highly
efficient communication between the
client and server over a single TCP connection,
where the connection is held
for the entire session.”

JavaServer Faces was also updated.
In this release, HTML5 can be passed 
through JSF directly to the browser, 
eliminating the potential for JSF to block 
new markup attributes that it hadn’t yet 
supported. 

Other mainstays of the Java EE platform
were updated. Java Message Service
(JMS) 2.0 includes a new programming
model. Additionally, a new mandatory
API allows any JMS provider to be
integrated with any Java EE container.


Numerous changes in Java EE 7 have
removed many inconsistencies between
the various bean types and the Java EE
Web layer. These allow Enterprise
JavaBeans 3.2 and Bean Validation 1.1 to
exist in cloud-based host environments,
without the potential for conflicts with
JSF and JAX-RS, the Java API for
RESTful Web Services.

New in Java EE 7 is JSR 352: Batch 
Applications for the Java Platform. This 
includes a new programming model for 
batch applications, as well as a runtime 
for scheduling and executing jobs. 

Harish Grama, vice president of
WebSphere product management and
development at IBM, said that Java EE 7
accomplished its goal of increasing
developer productivity. “With Java EE, you
truly get all the qualities of service any
enterprise application would need.
We’ve made it easier for programmers to
consume the complex, underlying
capabilities of the platform,” he said.

Getting development testing
into a new maturity model 


The Development Testing Maturity Model defines discrete levels of testing progress.


BY SUZANNE KATTAU

To help companies integrate testing
into their DevOps cycle, tool provider
Coverity has announced the Development
Testing Maturity Model, a guide
for implementing testing best practices.

Coverity’s Development Testing
Maturity Model works in conjunction
with the Coverity Development Testing
Platform, but it isn’t solely for
users of that platform. Rather, it’s a set
of practices for organizations no matter
what testing platform they’re using.
“Between our open-source experience
as well as having worked with a variety
of enterprise customers, we’ve come
up with this best practices model that
the whole industry can use,” said
Jennifer Johnson, VP of marketing at
Coverity.

The Development Testing Maturity
Model outlines a phased-in approach to
development testing adoption. “We’re
not telling people to just throw in a
technology and figure it out or do
everything all at once,” Johnson said.

The maturity model helps companies
find quality and security issues as
code is being developed. “We are not
saying that we’re going to replace QA
testing and security audits, absolutely
not. You still need to do them,” said
Johnson. “But what this does is it
helps remove a lot of the defects in
the front end that, right now, QA
testers and security auditors waste
time trying to fix, like basic bug
detection and fixing.”

This helps accelerate the DevOps
process, Johnson said, because if you
can fix the majority of defects in
development before they ever get to a QA
tester, then that tester can focus on
validating that the code works and that it
can scale, as well as do load testing. She
said it also helps the security auditor to
focus on making sure that the software
meets compliance requirements.

“You’ll have more reliability in what 
you’re actually pushing out into opera¬ 
tions. You’re going to have less of a 
troubleshooting escalation loop back to 
development once you’re in the field,” 
Johnson said. “This is because you’re 
helping everybody eliminate more 
defects so you get less out in the wild, 
so to speak.”


Five New Coverity Verification Services

Coverity also announced five new code-verification services,
which can work in conjunction with Coverity's Development
Testing Maturity Model.

• Coverity Supply Chain Audit Service: In the Maturity Model,
Level Four is Code Governance, where you get your supply
chain vendors to meet your policies. “This is kind of the same
thing, but it's a point-in-time snapshot,” said Jennifer Johnson,
VP of marketing at Coverity. “So if you were earlier on in the
security curve, like at Level One or Level Two, but you had a supplier
you thought might be problematic, you could have this
service done. We'll give you a report of what's in [their] code.”

• Mergers and Acquisition Due Diligence Audit Service: This
is virtually the same service as the Coverity Supply Chain
Audit Service, but for a different purpose. “If you're looking at
acquiring a company and you want to understand the quality of
their software before you decide to purchase them, we'll do a
one-time scan and give you a report of the quality,” Johnson said.

• Security Service: This is a service where Coverity looks at
the OWASP Top 10 and CWE Top 25 and tells you what kinds
of automated technology you could bring into the development
cycle to handle security issues relevant to your application.
“People will say you have to meet all OWASP Top 10 and CWE Top 25,
but actually it depends on the application,” Johnson said.

• Food & Drug Administration (FDA) Product Implementation
Validation Service: This service is for medical device
companies. “As part of the FDA approval process, not only do
they need to show that the quality of their software is high, but
they need to use static analysis as part of the FDA approval
process,” Johnson said.

• MISRA Service: The Motor Industry Software Reliability
Association (MISRA) publishes software quality guidelines for
the automotive industry. “The MISRA Service is a custom deployment
where we'll look at an organization's application and the pieces
of MISRA that are applicable to them,” Johnson said. “We'll
develop customized checkers to find those types of issues on an
automated basis.” —Suzanne Kattau

Bridging relational
and NoSQL worlds

10gen, IBM combine on data access standard


BY SUZANNE KATTAU 

10gen and IBM recently announced
they are collaborating on an open
standard that will enable mobile and Web
application developers to access data
held in both 10gen’s MongoDB NoSQL
database and IBM’s DB2 relational
database.

The open standard will be built
upon MongoDB’s BSON (Binary
JSON) protocol. While the two companies
are collaborating on the technology,
they encourage others in the
open-source community to adopt the
standard. “Rather than just seeing this
as 10gen and IBM collaborating on
MongoDB, what we’re actually
envisioning is building from that a vibrant
community that includes companies
beyond just us,” said Matt Asay, VP of
business development and corporate
strategy at 10gen.

Angel Luis Diaz, VP of software 
standards, open source and cloud labs 
at IBM, said the MongoDB-DB2 work 
is but the first of IBM’s offerings to sup¬ 
port what he called the company’s 
“mobile-first approach” to application 
development. He said mobile develop¬ 
ers today have more pressure to deliver 
applications with enterprise-class securi¬ 
ty and access to data. 



10gen's Matt Asay wants to build up a
community around MongoDB.


“ ‘Concept count’ is an industry term 
describing how there are many things 
developers need to know in order to get 
applications up and running quickly,” 
said Diaz. “When you look at actually 
connecting the application you’re 
building to the database, the concept 
count is pretty high; there are lots of 
different protocols and lots of different 
ways of doing it. What we found really 
interesting is that, in the open-source 
community, a lot of the ecosystem has 
been gravitating toward the MongoDB 
Wire Protocol, which is the way of 
describing how you access that data. 

“What we’re announcing is the col¬ 
laboration on really two fronts. One is 
trying to get some standardization 
around how you access the database. 
You can now access a NoSQL Mon¬ 
goDB database the same way that you’d 
access an IBM DB2 relational database, 
so developers have one way of getting 
at information that sits within their 
enterprise in a relational database or 
within a NoSQL database. And that is 
great for the ecosystem because it 
reduces the concept count. 

“Secondly, we’re working with 10gen
and the Mongo community to look at 
how we can bring other folks together 
to collaborate on this protocol and col¬ 
laborate on the Mongo database com¬ 
munity itself.” 

When asked if there are kinds of
applications that work better using a
NoSQL database or a relational
database, 10gen’s Asay replied that it’s more
about the type of query as opposed to
how information is structured. “It’s not
so much industry-split, it’s more
computer science/technical-split,” he said.

“And it’s not just databases; it could 
be in-memory caches or data grids... 
The real advantage, again going back to 
‘concept count,’ is it’s making it easier 
for folks to get access to the information 
they need when building their app.” 


A NoSQL addition to 
Windows environments 

FatCloud has released a new manage¬ 
ment studio for its cloud database that 
the company said eases bringing 
NoSQL into Windows environments. 

FatCloud in June introduced FatDB 
Management Studio, which CEO Ian 
Miller said looks like SQL Server Manag¬ 
er to simplify the bridging of the rela¬ 
tional and NoSQL worlds. “Before Studio, 
everything had to be done through code. 
There was no visualization,” he said. 

The solution starts with the FatDB 
NoSQL database, and layers on top an 
asynchronous work queue, a distributed 
file-management system, Map/Reduce 
capabilities, caching, and app services. 

“Having a SOA has challenges in 
throughput,” Miller explained. “Instead 
of moving data to processing, we want 
to move processing to the data, to 
achieve better throughput by blending 
data and processing." 

The management studio can plug 
into Visual Studio and can cache both 
SQL and FatDB data. It runs on multiple 
Infrastructure-as-a-Service providers, 
or can run in a data center on Windows 
Server, Miller said. 

"People want the cloud to be a sand¬ 
box," he added. "They want to prototype 
in the cloud but then bring it in-house." 
So solutions like FatDB, which he 
described as a document store and a key- 
value store, need to be flexible in building 
and deploying apps. 

For now, FatCloud is focusing on 
Windows companies. “Most companies 
using NoSQL are Linux/open source. Of 
150 companies using it, 130 are like 
that. We want to be the best among the 
20," Miller said. —David Rubinstein

Diaz said that developers using IBM 
Worklight Studio can access the free, 
open-source version of MongoDB 
without having to pay extra for the inte¬ 
gration; they can get MongoDB access 
in a Worklight Studio update, he said. 

Current users of IBM’s DB2 who want
the subscription version of MongoDB
will not incur an extra charge for the
integration between the products
either, Asay said. They would simply
need to pay for any licensed software
they use.

Competing for solutions is the Kaggle way

Crowdsourcing site uses cash and glory to tackle data problems 


BY CHRIS BARYLICK 

If you’re working on a data project and
find yourself stumped, call on legions of
professionals to solve your problem,
even as they compete for cash and
professional exposure along the way.

This is the philosophy of San
Francisco-based Kaggle, and with more than
US$11 million in funding and a network
of more than 94,000 assorted data
scientists worldwide, it seems to be
working. The premise is simple:
Organizations present problems
to the Kaggle community, outline
the rules, and back them with a cash
prize ranging from several hundred
dollars to $250,000 for commercial
competitions. Then there’s the
ongoing big jackpot: a $3 million
Heritage Health Prize to create
an algorithm to help people avoid
unnecessary hospitalization.

Once a competition has been 
created, it is crowdsourced with 
users of all types (scientists, engi¬ 
neers, developers, marketers or 
mathematicians) divvying up tasks 
and submitting a working model 
as a potential solution. Along with 
the cash and a cool new tag for 
their resumes, the winning team 
also generally gets a chance to 
speak with the engineering and/or 
management ends of the competition’s 
sponsor company. And the sponsor usu¬ 
ally enjoys the fresh perspective brought 
to them by the winning team. 

After a competition is finished, the 
winning team is tasked with writing up 
a performance report, and the competi¬ 
tion’s sponsor chooses whether to make 
the winning solution’s code exclusive, 
open source, or shared in any way. 

In some of the site’s more unusual
competitions, data-analysis teams are
working on the best model for identifying
bird species from continuous audio
recordings, while others are working on a
means of keeping whales from colliding
with transatlantic ships. More real-world
problems are reflected in the $250,000
competition for an algorithm to help
better predict airline flight delays.

The idea of gathering talented minds
to take on complex, Big Data problems is
as old as think tanks, but Kaggle seems to
have taken a different, almost
social-networking approach to it. Kaggle was
founded by Anthony Goldbloom, a
University of Melbourne graduate with a
degree in economics and econometrics.
He spent time in the economic modeling
unit for Australia’s Department of the
Treasury before working for the Reserve
Bank of Australia. During an internship
with “The Economist” magazine, he
found that a number of large-company
CIOs stated that while access to data was
undeniably critical, the challenge was in
finding talent to produce and work with
the numbers to find the best solutions for
large-scale problems.

Thus, the effort to make it easy for 
companies with Big Data problems and 
an army of data analysts to find each 
other was born. 

Where some might question why
some of the world’s best minds would
work on extracurricular projects,
president and chief scientist Jeremy
Howard, who also was the first
participant to win multiple competitions,
was quick to point out that the
competitions represented a great means of
developing new skills and sharpening
old ones.

“Kaggle is a great place to develop 
skills by using them, as opposed to hear¬ 
ing about them in an educational set¬ 
ting,” he said. “We think people 
come to learn, to have fun, to 
develop a professional reputa¬ 
tion, and then the prize money.” 

Still, it may be the competitive 
element that seals the deal for 
many participants. Tapping into a 
social network ideal, Kaggle 
allows users to readily form teams 
to attack problems together. This 
is enhanced by real-time, contin¬ 
uously updated leaderboards for 
each competition. The Kaggle 
website also shows which teams 
and individuals are currently 
leading the pack and by how 
much. The final score is tallied by 
Kaggle’s proprietary metrics, 
which account for the level of 
accuracy between the problem 
and the solutions submitted. 
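
Kaggle’s actual scoring metrics are proprietary and vary by competition, so the following is only a minimal sketch of how an accuracy-based leaderboard could work. It assumes root-mean-square error (RMSE) as the metric, and the function names, team names and numbers are made up for illustration.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between a submission and the held-back answers."""
    if len(predicted) != len(actual):
        raise ValueError("submission and answer key must be the same length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def leaderboard(submissions, actual):
    """Return (team, score) pairs sorted best-first; a lower RMSE ranks higher."""
    scored = [(team, rmse(pred, actual)) for team, pred in submissions.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical answer key and submissions, for illustration only.
answers = [1.0, 2.0, 3.0]
entries = {
    "team_a": [1.0, 2.0, 4.0],
    "team_b": [1.0, 2.0, 3.0],
    "team_c": [0.0, 0.0, 0.0],
}
for rank, (team, score) in enumerate(leaderboard(entries, answers), start=1):
    print(rank, team, round(score, 3))
```

Re-running `leaderboard` whenever a new submission arrives is enough to keep such a table continuously up to date; how Kaggle itself does this is not described in the article.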

According to Howard, a large 
number of solutions that win com¬ 
petitions become open-sourced and used 
later on for different projects. For mar¬ 
quee-level Kaggle competitions, the 
companies sponsoring the competitions 
typically express an interest in owning 
the intellectual property developed for a 
contest. 

One such property came from a com¬ 
petition to observe dark matter in the 
universe. Winning solutions were con¬ 
verted to open-source code, including 
one written for an IPython notebook. 

Kaggle also offers an elite program 
with Connect, where individuals and 
teams that have won multiple contests 
are given access to private, higher-end 
contests with prizes that Howard 
described as “quite lucrative.”















SD Times | July 2013 | www.sdtimes.com



Touch controls, cloud sync 
come to DevCraft .NET toolkit 


BY SUZANNE KATTAU 

UI component and devel¬ 
opment tool provider 
Telerik recently released 
DevCraft Q2 2013, an 
updated .NET develop¬ 
ment tool set that contains 
touch-friendly UI con¬ 
trols, cloud data synchro¬ 
nization, and cloud mobile 
Backend-as-a-Service sup¬ 
port. Other updates 
include a new 
Calendar control, Type- 
Script definitions, sample 
libraries of Windows 8 and 
Windows Phone 8 applica¬ 
tions, and a set of Windows 
Phone 8 design templates. 

Previously, Telerik 
released RadDataGrid for 
Windows 8, XAML and HTML. Now, 
the company has updated RadDataGrid 
to include inline editing. 

“TypeScript is an object-oriented
version of JavaScript,” said Phil Japikse,
Evangelism Lead for DevTools at
Telerik. “TypeScript allows traditional
object-oriented developers to write
compliant JavaScript code in a way they feel
comfortable.”

Japikse said the com¬ 
pany has added a set of 
Windows Phone 8 design 
templates so that devel¬ 
opers can add charts or 
data-entry forms into 
their applications. “We 
have added design tem¬ 
plates that give develop¬ 
ers, for example, open- 
high-low-close (OHLC) 
charts right in their appli¬ 
cation,” he said. 

The Cloud Data Sync 
feature lets developers 
build Windows Phone 8 
applications that work 
online and offline. 
Telerik has also added 
Data Storage for Win¬ 
dows 8, which Japikse said lets devel¬ 
opers store any data they have in their 
app locally through a SQL-compliant 
database. 

“We use SQLite,” he said. “Then 
when they connect, they can simply 
program their synchronization to get 
that data up into their cloud, wherever 
that may be.”
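
As a rough illustration of the store-locally-then-synchronize pattern Japikse describes, here is a minimal sketch using Python’s built-in sqlite3 module. It is not Telerik’s API: the `notes` table, the `synced` flag and the `upload` callback are all hypothetical names chosen for the example.

```python
import sqlite3


def open_local_store(path=":memory:"):
    """Create the local table; `synced` marks rows already pushed to the cloud."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY, "
        "body TEXT NOT NULL, "
        "synced INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def save_note(conn, body):
    """Write to the local database; this works whether or not the device is online."""
    cursor = conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    conn.commit()
    return cursor.lastrowid


def pending_count(conn):
    """Number of rows that have not yet been pushed to the cloud."""
    return conn.execute("SELECT COUNT(*) FROM notes WHERE synced = 0").fetchone()[0]


def sync_pending(conn, upload):
    """Push each unsynced row through `upload`; mark it synced only on success."""
    rows = conn.execute(
        "SELECT id, body FROM notes WHERE synced = 0 ORDER BY id"
    ).fetchall()
    pushed = 0
    for note_id, body in rows:
        if upload(note_id, body):  # hypothetical callback standing in for a cloud call
            conn.execute("UPDATE notes SET synced = 1 WHERE id = ?", (note_id,))
            pushed += 1
    conn.commit()
    return pushed
```

Because a row is flagged only after `upload` reports success, a dropped connection simply leaves the row pending for the next attempt.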



DevCraft's RadControls for 
Windows Phone consists of 
various UI controls and MBaaS 
support. 


In other component news... 

Database connectivity solution 
provider Devart recently released new 
versions of its Delphi Data Access Com¬ 
ponents with support for iOS and the 
NextGen compiler. The enhanced com¬ 
ponents let iOS developers establish a 
direct connection to InterBase ToGo, 
MySQL, Oracle, PostgreSQL and SQLite 
databases. With support for the 
NextGen compiler, the components can 
be used when developing database 
applications using automatic reference 
counting. Devart has also recently
released new versions of its dbExpress
drivers, adding support for
Embarcadero's RAD Studio XE4.

Dynamsoft, a developer of scanner
programming libraries and JavaScript
webcam plug-ins, has upgraded the
barcode reader add-in for its latest
Dynamic .NET TWAIN SDK version 4.3. Its 1D or
2D barcode reader performance and its
barcode recognition have been
improved. Also, the SDK allows developers
to use zone optical character
recognition (OCR) on scanned documents.

PDF document-management solution 
provider soft Xpansion recently released 
its PDF Xpansion Reader for Windows RT. 
On all ARM-based Windows RT devices, 
PDF Xpansion Reader lets developers 
view and print PDF files, read comments, 
fill and print PDF forms, or search 
through PDFs.


DevExpress adds controls to .NET dev tool 


BY SUZANNE KATTAU 

DevExpress recently released
DevExpress Universal 13.1, an updated
version of its integrated .NET software
development tool set. .NET
developers can create Windows, Web
and mobile applications using the new
controls and features across the
WinForms, ASP.NET, MVC, WPF,
Silverlight and Windows 8 XAML tool
sets.


DevExpress Universal’s WinForms
tool set contains new spreadsheet and
map controls, data editors (including
sparkline and drop-down tree list), an
icon library, a Windows 8 Live Tile
manager, and a touch-optimized skin.
The updated ASP.NET and MVC tool
set gives Web developers paging
support for touch-centric applications, an
ASP.NET image gallery, an ASP.NET
and MVC application theme, an MVC
image slider and file manager, and an
MVC Captcha feature. The tool set
now also supports SharePoint 2013.

The WPF tool set now contains a
chart wizard, a property grid control,
Windows 8-inspired controls, and a
Visual Studio application template
gallery. The updated WPF and
Silverlight tool sets both contain a new
banded-grid view, and improved map
and range controls.

Failure
to Test 



The key to mobile success is testing. 
So why do fewer than 
10% of developers do it? 


BY ALEXANDRA WEBER MORALES 


“DevOps is a powerful metaphor.
We think there needs to be
another one for mobile:
DevTestOps, or DTO, as we’re calling
it. Otherwise, you’ll be provisioning
very quickly and agilely, but your
customers will go away and you’ll never see
them again,” said Tom Lounibos, CEO
of SOASTA, a cloud testing company
that, along with much of its
competition, is moving nimbly into the mobile
testing space as a diverse new market
for app-testing tools begins to form.

The need for quality control as mobile 
apps proliferate is obvious, but develop¬ 
ers aren’t getting the message yet. In his 
keynote at the first-ever Xamarin user 
conference in April, CEO Nat Friedman 
said that a survey by his company found 
only 8% of developers were testing their 
apps prior to release. 

Is that shocking figure an
exaggeration? New York City-based senior
software engineer Greg Shackles doesn’t
think so. About half the developers at
his mobile testing talk (also at the
Xamarin conference) said they were testing,
“but that was a self-selected audience
of people interested in testing in the
first place. I’d definitely agree with
Xamarin’s saying 8% over 50%. One of
the problems with testing mobile is that
it’s a pretty difficult thing to test.”

'This is not the Web'

There are many reasons why mobile
apps are different from Web software.
The way app stores (there are now
more than 70, according to
Mobyaffiliates’ directory) funnel approved
executables to consumers’ tablets and
smartphones is a big one.

Former Web developer Arlo Leach
launched Set List Maker, a show-planning
tool for musicians, for iOS after
teaching himself Objective-C. He
wholeheartedly believes that app
marketplaces make it easier to sell software:
“It’s a pretty unique opportunity. I built
a lot of products on the Web and they
just sat there, and I felt powerless to
spread the word. I’ve done no promotion
for these apps and they found an
audience on their own in the App Store.”

The ease with which highly functional 
apps that meet a specific need can find 
success is exceeded, however, by the risk 
posed by poorly tested apps that garner 
bad ratings. “The problem has been that 
companies are putting mobile apps out 
with minimal testing, and that’s increas¬ 
ingly coming back to bite them. Failure 
is extremely visible and costly,” said IDC 
analyst Melinda Ballou. She plans to 
write a mobile-testing market report 
soon, but said the current market is 
“eclectic,” with major players just begin¬ 
ning to announce tools and strategies. 

There are bigger risks than just being 
sunk by bad ratings. Buggy, slow or 
downright destructive apps (see “Hello 
Fail”) can be hard to fix once released. 
“This is not the Web,” said Shackles, 
who works at OLO Online Ordering. 

“I’m a Web developer on top of being a
mobile developer. With the Web, if you
push out a bug, you can fix that, and the
next time the user hits that page, it’s
fixed. With mobile apps, you have to go
through the submission process. In
some cases you can expedite it, but even
then you still have to wait for users to
update. There’s a very big barrier to
getting emergency fixes out to users.”

Further, unlike cloud-based soft¬ 
ware, apps, be they native or hybrid or 
even HTML5-based, are dependent on 
device-specific form factors, browsers 
and operating systems. Google allows 
hardware manufacturers to customize 
the Android OS. The platforms them¬ 
selves are changing constantly, and at a 
pace unheard of in the PC era, where 
OS upgrades happened every five years 
or so. (As an example, Apple released 
iOS 6 less than a year after iOS 5.) 

Apps gain privileged
access to APIs, storage and
other resources, unlike Web
applications, according to the
creator of Localytics, a mobile
analytics tool. Apps are designed
less for passive information
consumption, revolving primarily
around actions and multi-tasking.

Finally, there’s the aesthetic
question. As Xamarin’s Friedman said at his
user conference, enterprise apps are
starting to “suck less” thanks to the
consumerization of IT. There’s a newfound
expectation that apps won’t just function,
they’ll be beautiful and immersive.

Don't be greedy 

Mobile app beauty can’t remain skin 
deep, either. A major concern, often 
foreign to Web developers, is energy 
consumption. Batteries get sucked dry 
thanks to programming errors such as: 

• Automatically starting the app when 
the device powers on, or having it run 
in the background when not in use 

• Running wild with energy-hogging 
functions such as GPS, camera, 
accelerometer and other device sensors 

• Uploading user information and 
downloading ads 

• Not allowing the device to go into 
sleep mode 


Data usage, privacy and security are 
also concerns that many ignore, at 
users’ peril. “We continuously see 
enterprises not completely aware and 
educated with respect to the type of 
vulnerabilities mobile apps have. 
They’re more focused on coding the 
app and pushing it over to production,” 
said Bala Venkat, chief marketing offi¬ 
cer for Web-security firm Cenzic, who 
claimed 15% to 30% of its new cus¬ 
tomers are developing mobile apps. 

Tablets and phones can’t be network 
hogs, either, both for economic reasons 
(not incurring data charges, for example) 
and because they depend on intermit¬ 
tent wi-fi and spotty cellular networks. 

According to Rebecca Clinard, a
mobile tester blogging for Neotys (a
maker of mobile load and stress-testing
tools), apps experience “delays in
response times, which in turn affect the
duration that ports or sockets are kept
open, an environmental resource usage
that is frequently seen with mobile
applications. It’s because of this variable
network connectivity that the user
experience isn’t always an absolute known.”

From the developer perspective,
these constraints mean they should
reduce network dependencies.
Techniques include “decreasing embedded
requests, using local storage on the
device for caching static files, enabling
transfer compression, avoiding redirects,
minimizing data content size, reducing
the number and length of cookies,
removing lint from code (white spaces
and comments), organizing the delivery
for incremental rendering, aggregating
requests, and using PUSH behaviors.
Creating lighter-weight mobile
applications allows the overall end-user
experience to be less dependent on the device
network vulnerabilities,” wrote Clinard.
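
Two of the techniques Clinard lists, aggregating requests and enabling transfer compression, can be sketched in a few lines. This is an illustration only, using Python’s standard json and gzip modules; the batch format, the `seq` field and the example paths are assumptions, not part of any real API.

```python
import gzip
import json


def aggregate_requests(requests):
    """Bundle several small calls into one payload: one round trip instead of many."""
    return {"batch": [{"seq": i, **req} for i, req in enumerate(requests)]}


def encode_body(payload):
    """Serialize without extra whitespace, then gzip; returns (raw, compressed) bytes."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return raw, gzip.compress(raw)


def decode_body(compressed):
    """Reverse encode_body, as the receiving side would."""
    return json.loads(gzip.decompress(compressed).decode("utf-8"))


# Twenty hypothetical page requests that would otherwise be sent one by one.
calls = [{"path": "/menu/items", "params": {"page": n}} for n in range(20)]
raw, packed = encode_body(aggregate_requests(calls))
print(len(raw), "bytes uncompressed;", len(packed), "bytes gzipped")
```

Repetitive payloads like this one compress well, but very small or already-compressed bodies may not shrink at all, so the saving is worth measuring per app.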

Stay safe 

The App Quality Alliance (AQuA) Best 
Practice Guidelines is a helpful com¬ 
pendium of critical practices. It recom¬ 
mended that developers “use HTTP 1.1 
with pipelining. HTTP pipelining paral¬ 
lelizes HTTP requests within a TCP 
session. It’s perfect for improving user 
experience in high-latency environ¬ 
ments, but it’s important to note that 
HTTP pipelining requires implementa¬ 
tion on both client and server.” 

Above all, the computers we carry in
our pockets are communication devices.
As such, they must be able to make and
receive emergency calls, as well as
handle other interruptions, such as sending
or receiving SMS and/or MMS messages.
As AQuA noted: “The user must be
able to accept an incoming phone call
while the application is running. It
should then be possible to resume
at the same point in the application
at the end of the call, or a
logical restarting point.”

Oh, and don’t melt the
phone. That’s the advice of
mobile testing expert JeanAnn
Harrison, who said that mobile
devices in heavy usage (especially while
being simultaneously charged) have
temperature limits and shut off
automatically before their components fuse
into a molten mess.

The mobile testing tool stack 

Once developers realize what’s at stake 
in this booming mobile app economy, 
they’re halfway home. The next step is 
to recognize that the device emulators 
that come with most development envi¬ 
ronments won’t cut it, as far as testing 
goes. (“No matter what you’re doing, 
simulator or emulator testing is not 
enough,” said Shackles.) A burgeoning 
array of tools is available to exercise 
apps at every level, and outsourcing the 
whole affair is always an option. 

Consumer-oriented companies such
as Google, Microsoft, HBO, Amazon
and USA TODAY are already fans of
uTest, for example, which in addition to
automated tools for functional, usability,
security, localization and load testing
of mobile apps, boasts 80,000
professional testers from 190 countries.


Cross-platform: The third wave

Apple aimed for beautiful user interfaces. Android aimed for ubiquity. At his company's first
user conference held in Austin in April, Xamarin CEO Nat Friedman made the case for
delighting developers as part of a third wave of mobile application technology. “Who was
focusing on the developer? We decided that that's what we were going to do. We were
going to focus on delighting developers every step of the way toward their app,” he said.

C#-based Xamarin is the latest company to move into cross-platform app creation,
joining Appcelerator's JavaScript-based Titanium; FireMonkey's Delphi approach;
RubyMotion; and a Java-based approach from Oracle, among others.

So is there a need for better developer experiences like what Xamarin is promising?
“In my opinion, absolutely,” said Greg Shackles, a mobile developer with OLO
Online Ordering. “Historically, Android tools are not that good. They use the Eclipse
IDE. Even the language, Java, hasn't evolved in a long time. Then, for Objective-C, you
have an Internet full of people who have problems picking up that language.

“Language-wise, Xamarin is right on top of the new shiny stuff. It's leagues better
than Objective-C and Java. You plug into Visual Studio, or use Xamarin Studio's free
IDE. It's still maturing, but better than what's out there.”

—Alexandra Weber Morales

Partnering with established testing 
leaders and development platforms is 
all the rage. Joining with market leader 
HP QuickTest Professional is Perfecto 
Mobile, for example. “As mobile appli¬ 
cations become business-critical, enter¬ 
prises need automated mobile-testing 
solutions that plug into their existing 
quality processes,” said Eran Yaniv, 
CEO at Perfecto Mobile. “It is our mis¬ 
sion to address these challenges with 
MobileCloud for QTP.”

Also worth mentioning is Silk for 
Mobile, launched last year, which 
records test scripts for iOS, BlackBerry, 
Android, Symbian and Windows 
Mobile devices. 

A commonly used tool among mobile 
developers is TestFlight. Integrated into 
Xamarin Studio and Visual Studio as 
well as Android tools, this free tool distributes 
builds to testers and distribution 
lists, and helps manage the feedback 
from manual and automated tests. 

In the same vein, Google Play 
Developer Console now lets developers 
select a small test group from Google 
Groups and Google+ to alpha-test apps. 
Apps that pass muster can be escalated 
to beta through to launch. 

Testing for gestures 

“Compilation is only the first unit test,” 
said Shackles. While most testers argue 
that a significant amount of mobile testing 
must always be done manually, 
there are a variety of tools out for 
user-interface and acceptance-test automation. 
With a history of .NET GUI testing, 
Ranorex has expanded its 
framework to cover Android, iOS and 
Windows Phone apps. 

San Francisco-based LessPainful 
(recently acquired by cross-platform 
mobile development company Xamarin) 
offers tools similar to Selenium WebDriver 
but optimized for the “vastly different” 
mobile space. Its open-source 
calabash-android and calabash-ios are 
low-level automation and testing libraries 
for Android and iOS. These enable test 
code written in Cucumber to interact 
with native and hybrid apps, exercising 
end-user actions such as touches or gestures, 
making assertions and capturing 
screenshots on devices. 

An interesting feature of 
Cucumber is that it uses natural 
language that non-developers 
can understand, like the following: 

When I touch the "rate_it" button 
Then I should see the rating panel 
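
To show how plain-language steps like these can end up driving code, here is a minimal sketch in Python. It is not how Cucumber or Calabash are actually implemented (their step definitions are normally written in Ruby). The `step` decorator, the `FakeApp` class, the button name `rate_it` and the step patterns are all hypothetical names invented for this illustration.

```python
import re

# Registry of (compiled pattern, handler) pairs. Hypothetical, for illustration only.
STEPS = []


def step(pattern):
    """Register a handler for step lines matching the given regular expression."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


class FakeApp:
    """Stand-in for a real app under test; it only records which views are visible."""

    def __init__(self):
        self.visible = set()

    def touch(self, button):
        # Assumed behavior: touching the rating button reveals the rating panel.
        if button == "rate_it":
            self.visible.add("rating panel")


@step(r'^When I touch the "([^"]+)" button$')
def touch_button(app, name):
    app.touch(name)


@step(r"^Then I should see the (.+)$")
def should_see(app, view):
    assert view in app.visible, f"{view!r} is not visible"


def run_scenario(app, lines):
    """Run each step line through the first registered pattern that matches it."""
    for line in lines:
        for pattern, handler in STEPS:
            match = pattern.match(line.strip())
            if match:
                handler(app, *match.groups())
                break
        else:
            raise LookupError(f"no step definition for: {line!r}")


if __name__ == "__main__":
    run_scenario(FakeApp(), [
        'When I touch the "rate_it" button',
        "Then I should see the rating panel",
    ])
    print("scenario passed")
```

The point of the design is that the sentence a tester reads and the code a developer maintains are joined only by a pattern, so non-developers can write or review scenarios without touching the handlers.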

SOASTA TouchTest is another option 
for automating mobile application testing, 
and gestures will only become more 
important over time, according to Lounibos. 
“Every mobile app is built on gestures 
such as swiping to unlock the 
screen,” he said. “But we’ll start to go to 
more advanced gestures in the next few 
years, like hovering.” 

But testing gestures requires a special 
approach, said Lounibos. “If you use an 
optical reader to capture the static view 
on the device, gestures can break the 
test. They become a pain in the butt,” he 
said. “Our engineering team started 
looking at it deeply and took a very creative 
approach. We embed a library into 
the object repository, testing inside the 
app. We capture all forms of movement, 
with precision. You can incorporate 
TouchTest into the daily build with Jenkins 
or Bamboo. Appcelerator uses it for 
their Titanium platform. Five hundred 
thousand Titanium mobile developers 
are using TouchTest.” 

The tool can automate more than 
60 hours of manual testing in a single 
night, Lounibos claimed. 

It's the device, stupid 

Fragmentation of the device market 
continues apace, and testing for every 
possible permutation bedevils even 
mobile development platform makers. 


Exadel vice president of engineering 
Dmitry Binunsky has been looking for a 
partner that could work with his company’s 
cloud-based app development 
tool, Appery.io. 

“That is a little bit of a challenge, currently,” 
he said. “If you are on the cloud 
and not a device, we have a mobile emulator 
that can test all the features, except 
those on the actual phone. The challenge 
there is that all devices are different. 
There might be a card reader, or 
there’s no camera. There are QA companies 
that make a device list available on 
the Internet and charge you by the hour. 
We are talking to potential partners to 
simplify the process for our customers of 
testing on the device.” 

Perfecto Mobile’s MobileCloud is one 
of an increasingly popular breed of service 
that essentially rents “a multitude of 
real mobile handsets and tablets connected 
to live mobile networks spread in 
different geo-locations,” according to the 
company’s marketing literature. 

Another early investor in the mobile 
testing space is Keynote, which in 2011 
acquired DeviceAnywhere and its customer 
base of around 1,000, according 
to an IDC report. 

These companies have just been 
joined by Xamarin, which announced 
Test Cloud in April. The service, now in beta, 
is based on technology acquired from LessPainful. 
With a UI that reflects the company’s 
mobile focus, the test cloud, in conjunction 
with Calabash and Cucumber, will 
allow automated testing on real devices. 

“We tell all our customers you’ve got 
to start running on real hardware from 
day one,” said Friedman at the Xamarin 
user conference. “The reason we say 
that now is because we’ve seen what 
happens if you don’t. You spend months 
developing your app in a simulator, you 
get ready to release, and you find out, 
‘Oh, there’s a memory limit.’” 

“I’m anxiously awaiting my invite to 
try it out,” said Xamarin fan Shackles. “I 
think it’s going to solve a huge problem. 
You’ve got a handful of iOS devices you 
need to test against, but on the Android 
side, there are thousands of different 
devices, screen sizes, and Samsung, 
Motorola, and the others all provide 
their own layer of modifications on top 
of Android.” 

Perform quickly, think globally 

Once device concerns have been 
assuaged, it’s time to think about performance, 
advised Wayne Ariola, VP of 
strategy for Parasoft, which has a 25-year 
history in the code-quality and 
testing business. “Too often companies 
believe that functional testing alone is a 
replacement for truly understanding 
how it’s going to perform in the wild,” 
he said. “Since there is a massive rush 
to market, enterprises are not taking 
time to remediate performance. The 
first thing they need to be doing much 
better is designing mobile apps for specific 
geo-location.” 

Parasoft’s service virtualization can 
help, Ariola claimed. Parasoft Virtualize 
allows developers to “simulate how 
things will perform given bandwidth, jitter, 
packet loss. Our service-virtualization 
environment can simulate all that chaos. 
No matter what mobile phone you have, 
when you push a button, it’s sending 
something somewhere to make a 
request, then waiting for a response. 
Your phone has [communication] protocols 
that it’s managing: SMS, telephone, 
JSON calls or REST-based calls. All that’s 
being sent over telco or wi-fi network for 
app to perform. Unless you have a game, 
anything transactional requires a connection 
to the outside world.” 
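
To make the idea of simulating bandwidth, jitter and packet loss concrete, here is a small Python sketch. It is not Parasoft Virtualize or any vendor's API. The `SimulatedNetwork` class, the `request_with_retry` helper and every parameter are assumptions made up for this example, and the model is deliberately crude.

```python
import random


class SimulatedNetwork:
    """Toy model of a lossy, jittery link. All parameters are illustrative."""

    def __init__(self, bandwidth_kbps, latency_ms, jitter_ms, loss_rate, seed=None):
        self.bandwidth_kbps = bandwidth_kbps
        self.latency_ms = latency_ms
        self.jitter_ms = jitter_ms
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)  # seedable so test runs can be repeated

    def send(self, payload: bytes):
        """Return the simulated one-way delay in ms, or None if the packet is lost."""
        if self.rng.random() < self.loss_rate:
            return None
        # bits divided by kilobits-per-second gives milliseconds
        transfer_ms = len(payload) * 8 / self.bandwidth_kbps
        jitter = self.rng.uniform(-self.jitter_ms, self.jitter_ms)
        return max(0.0, self.latency_ms + jitter + transfer_ms)


def request_with_retry(network, payload, timeout_ms, max_attempts=3):
    """Send until a reply arrives within the timeout; return (attempts, delay_ms)."""
    for attempt in range(1, max_attempts + 1):
        delay = network.send(payload)
        if delay is not None and delay <= timeout_ms:
            return attempt, delay
    raise TimeoutError(f"no response after {max_attempts} attempts")


if __name__ == "__main__":
    # A slow, lossy link: 256 kbit/s, 300 ms latency, 100 ms jitter, 20% loss.
    poor_link = SimulatedNetwork(256, 300, 100, 0.2, seed=42)
    try:
        print(request_with_retry(poor_link, b"x" * 2048, timeout_ms=500))
    except TimeoutError as exc:
        print("request failed:", exc)
```

Running application logic against a model like this, rather than against an ideal office connection, is one way to see how retry and timeout choices behave before real users find out.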

Neotys NeoLoad is another mobile 
solution that gives you the ability to load-test 
both Web and native applications on 
mobile devices under real network 
conditions. SOASTA, 
too, offers this type of load testing 
as part of its bread-and-butter 
CloudTest platform. Testers can run 
browsers and real mobile UI testing 
simultaneously with protocol-level load 
testing. The company claims differentiation 
around its use of real users in addition 
to virtual users. SOASTA can also 

Hello, Fail 

A gallery of apps behaving badly 


A leader in the mobile testing space, uTest 
has done an excellent job of highlighting 
what can go wrong on smartphones and 
tablets. “There's no way to hide poor 
mobile app quality in the era of social," the 
company quotes Michael Croghan, mobile 
solutions architect for USA TODAY (and 
uTest customer), in its e-book on the topic. 

This potentially disastrous Kindle for iOS 
update could not be remedied right away. 

Indeed, pictures of app fails are legion 
on the Web. "Here's a picture of what happened 
to me when I was in the airport trying 
to change my seat on Southwest," said 
Wayne Ariola, VP of strategy for Parasoft. 
"If my mother got this message, she'd say, 
'What's wrong with JSON?' The middleware, 
which we simulate, returned an 
invalid JSON string. What happened here is 
it was a concurrency issue. Two services 
shot out from the app when I made a 
request. It was a race condition, essentially." 

Software engineer Greg Shackles cited 
the well-publicized example of Amazon's 
Kindle for iOS update. "They released an 
update that, when you ran it, the user's 
Kindle media on the device was wiped 
out," he said. "The best thing they could 
do is update the release notes for the app 
in the app store. I know on Android there 
are ways to retract an update. That would 
partially solve the problem." 

Then there are regulatory questions 
that are sure to become more pressing 
over time. In May, the Food and Drug 
Administration warned a mobile app 
developer for neglecting to obtain 510(k) 
medical device clearance for the uChek 
Urine Analyzer app. The FDA argues that 
using the phone camera to read urine 
dipsticks meant "the phone and device 
as a whole functions as an automated 
strip reader" that required clearance as 
a "urinalysis test system." 

A concurrency issue generated this confusing 
error message for Southwest Airlines. 

Finally, consumers are well aware that 
certain apps are battery-hungry, given 
the obvious and frustrating symptoms. 
Purdue University researchers found that 
no-sleep energy bugs can suck the electrons 
out of inactive handsets in five 
hours. Popular offenders included Google 
Maps, Facebook and K-9 Mail. 

—Alexandra Weber Morales 


track browser-side metrics and on-device 
battery and memory performance. 

Analyze this! 

The last piece in the app puzzle is analytics, 
so that companies can learn from 
mistakes and customer behavior. Thanks 
to its analytics tool mPulse, “We’ve seen 
process compression. Developers are 
beginning to watch these screens... They 
want to understand why they’re seeing 
red. Everything is real time,” said 
SOASTA’s Lounibos. 

Along with bandwidth and site metrics, 
and page or native app load time, 
mPulse tracks engagement via bounce, 
exit and conversion rates. mPulse also 
notes user location, device type, carrier 
speed and application usage. 
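
As a rough illustration of what engagement figures such as bounce and conversion rates mean, the Python sketch below computes them from a list of sessions. The definitions used here (a bounce is a session with a single page view; conversion is the share of sessions that completed a goal) are common conventions, not necessarily the ones mPulse applies, and the `Session` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One user visit. Field names are assumptions made for this example."""
    pages_viewed: int
    converted: bool


def engagement_rates(sessions):
    """Return bounce and conversion rates as fractions between 0 and 1."""
    total = len(sessions)
    if total == 0:
        return {"bounce_rate": 0.0, "conversion_rate": 0.0}
    bounces = sum(1 for s in sessions if s.pages_viewed == 1)
    conversions = sum(1 for s in sessions if s.converted)
    return {
        "bounce_rate": bounces / total,
        "conversion_rate": conversions / total,
    }


if __name__ == "__main__":
    sample = [
        Session(pages_viewed=1, converted=False),
        Session(pages_viewed=4, converted=True),
        Session(pages_viewed=1, converted=False),
        Session(pages_viewed=2, converted=False),
    ]
    print(engagement_rates(sample))
```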

San Francisco-based Crittercism 
claims to be the world’s first mobile 
application performance management 
platform, with dashboards of real-time 
metrics as well as debugging breadcrumbs 
to help discover the causes of 
crashes or handled exceptions. Applause 
is another mobile app analytics service, 
launched this year by uTest. There’s also 
the aforementioned Localytics. 

Another option is Flurry Analytics, 
launched in 2008 and boasting funnel 
analysis, custom segmentation and conversion 
tracking, audience demographic 
estimates, benchmark comparisons, and 
cross-application funnels. The Google Analytics 
for Mobile Apps SDKs are also available 
for Android developers, of course. 

Hands still needed 

Ultimately, devices that are carried in 
pockets, touched, spoken to in myriad 
ways, pressed to ears, and, soon 
enough, worn continuously on the 
face or other parts of the body, will 
need more manual testing than traditional 
Web and desktop software. 
According to Harrison, manual is still 
best, not to mention most economical, 
for tests of trainability, configuration, 
performance and usability. 


Even Shackles, a testing expert, 
doesn’t yet use any virtual mobile lab 
testing. Like many developers, he tests 
on a handful of devices manually. “It’s 
a relatively new space, and there are a 
lot of extremely hard problems to 
solve,” he said. “If you have a farm of 
devices to run tests in, even the infrastructure 
side is difficult. I try to put as 
much into unit-testable layers, 
because the more the surface area of 
your code is covered by automated 
unit tests, the more when you run on 
the device, you can focus on UI.” 
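
The layering described in that quote can be sketched as follows. This Python example is an assumption-laden illustration, not code from any developer quoted here: `order_total`, `CheckoutScreen` and the tax figures are invented. The idea is that the calculation lives in a plain function that needs no device, while the UI class does nothing but display the result.

```python
def order_total(prices, tax_rate):
    """Pure business logic: no UI or device dependencies, so it is easy to test."""
    if tax_rate < 0:
        raise ValueError("tax_rate must be non-negative")
    subtotal = sum(prices)
    return round(subtotal * (1 + tax_rate), 2)


class CheckoutScreen:
    """Thin UI layer: it only formats what the logic layer computes."""

    def __init__(self, render):
        # `render` is any callable that displays text; on a device it would
        # be supplied by the platform's UI toolkit (hypothetical here).
        self.render = render

    def show_total(self, prices, tax_rate):
        self.render(f"Total: ${order_total(prices, tax_rate):.2f}")


def test_order_total_adds_tax():
    assert order_total([10.00, 5.00], 0.10) == 16.5


def test_order_total_rejects_negative_tax():
    try:
        order_total([1.00], -0.5)
    except ValueError:
        return
    raise AssertionError("expected ValueError")


def test_screen_formats_total():
    shown = []
    CheckoutScreen(shown.append).show_total([10.00, 5.00], 0.10)
    assert shown == ["Total: $16.50"]


if __name__ == "__main__":
    test_order_total_adds_tax()
    test_order_total_rejects_negative_tax()
    test_screen_formats_total()
    print("all unit tests passed")
```

With the arithmetic covered by fast tests like these, time spent on a physical handset can go to what only a handset can show: layout, gestures and responsiveness.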

It all comes back to DevTestOps, 
said Lounibos. “In mobile, you don’t 
have minutes to repair a problem. You 
have to do testing and some level of 
predictive analytics before you go live, 
or you risk not being in business very 
long. We all got caught up in the cool 
things we could do, and we forgot about 
quality. That’s not unusual for the 
beginning of a marketplace.” 

Find this story at http://sdt.bz/60783 

Agile is a 
team effort 

Microsoft focuses on helping developers while 
bringing the message of agility upstream 


The early part of the 21st century 
saw the rise of agile software 
development. It was quite prescriptive 
in nature, almost to the point 
of being religiously dogmatic. “Individuals 
and interactions over processes 
and tools,” the original manifesto 
declared. “Working software over 
comprehensive documentation.” 
“Responding to change over following 
a plan.” 

If you’re not doing pair programming, 
or Scrum, or Extreme Programming, 
then you’re not agile, the pundits 
cried. It was all or nothing. 

BY DAVID RUBINSTEIN 

And now, some 12 years after the 
manifesto was written, agile has crossed 
the chasm, going from a grassroots 
movement into the mainstream. And 
today, organizations are looking for 
maturity in their agile development 
processes. 

But the question is, what represents 
agile maturity? How do organizations 
know how far down the agile road 
they’ve gone? Several organizations 
have released maturity models for agile 
adoption, and they mostly agree that 
agile maturity means scaling the 
process beyond a team level. 


Microsoft is focusing on the ability 
to scale agile to all stakeholders of a 
software project, but without dictating 
how to do that. It’s what the company is 
calling “agile your way.” 

Agile's up above it 

Aaron Bjork, project manager on 
Microsoft’s Team Foundation Server 
team, said, “I think we’re entering a 
period where teams want agile maturity, 
but I think agile maturity comes 
with scaling it, and you can no longer 
just be a team within an organization 
and say, ‘Well, we’re an agile team and 

Agile without unit tests is just plain 'reckless' 

While card walls, user stories and continuous integration are pillars of agile development, 
so is test-driven development. While it can be a fairly difficult process to adopt, 
mature agile practitioners should be doing it, said VersionOne CTO Ian Culling. 

"If you're running iteratively, and you constantly deliver code on a story without 
testing, you're running a deficit and building up technical debt," he said. "Your delivery 
of stories will slow over time" as defects take longer to find and correct. 

Eli Lopian, CEO of testing software company Typemock, took that thought a step 
further. Those not doing unit testing "are reckless agile practitioners," 
he said. "They go for the speed but get caught at 
the end." 

The problem, he continued, is that “people are trying to 
become Olympic swimmers without being healthy." While 
those teams see unit testing as slowing down the release 
cycle, he claimed just the opposite is true. “Code that has 
unit tests has fewer bugs, which reduces the QA effort," 
which gets quality, working software out the door more 
quickly. 

Microsoft offers unit-testing tools in Visual Studio, Lopian 
said, but he noted that Typemock integrates both with Visual 
Studio and Team Foundation Server to add features beyond 
those that Microsoft offers, and to provide a framework for 
unit testing that lowers barriers to becoming agile. 

Typemock offers a visual-coverage tool so users can see which pieces of code are 
covered by unit tests and which aren't, Lopian said. In the future, Typemock will offer 
one-button unit-test generation for code that currently isn't covered by tests, he 
said. "Putting working software above all, putting communication above other 
things, that's still how people are defining agile," Lopian said. 

—David Rubinstein 



Unit tests are good for 
speed, says Eli Lopian. 



we’re successful,’ because now you’ve 
got 20 agile teams, or 10 agile teams, 
and success is not defined by a single 
agile team being successful, it’s 
defined as all of us doing this and 
being successful, so I think that’s the 
new challenge.” 

Bjork described agile at the team 
level as science, but scaling it up the 
enterprise is more of an art. “I’ve often 
described it, as at the team level, 
there’s a lot of science you can apply to 
agile. I think Scrum is a science. Here’s 
how you do it. There’s a formula. You 
break it up into iterations, you have a 
backlog, you have roles you can apply 
to it,” he said. 

“People can grab it and accept it and 
follow the prescriptive guidance, the 
science if you will, and be successful. 
Above that, I think there’s a lot more art 
that has to take place, because as you 
move up in an organization, and you’re 
trying to make stakeholders happy, trying 
to make leadership teams happy, 
you’re dealing with personalities, you’re 
dealing with people that have strong 
opinions, that have built their careers 
maybe doing things a different way, so I 
think maybe you have to apply an art 
form to it, and I think that’s sort of 
where ‘agile your way’ or ‘agile on your 
terms’ is coming out. 

“I think making agile successful at 
that level can’t be prescriptive. It has to 
be a little more flexible. I think that’s 
where we’re at in the industry, trying to 
figure out how to do that.” 

There was a time, Bjork said, when 
you could sell agile tools to teams. But 
today, he cautioned, you had better have a 
message for stakeholders. “If you think 
about our tooling, in the last release, in 
2012, our focus was on making an agile 
team successful,” he said. 

“Scrum was the pattern we followed, 
and we built tooling to support that. We 
built sprint-planning tools, we built 
taskboards, we built backlog-management 
tools. That was our focus and 
that’s where we put our energy. As we 
move forward, our energy is now 
around the assumption that you’ve got 
lots of agile teams using that tool set, 
and now how do you make sense of it at 
the stakeholder level, at the leadership 
level? And that’s what, internally, we’ve 
been calling enterprise agile. How do 
you do this inside an enterprise, not just 
at a team level?” 

Getting the enterprise involved 

The struggle to scale agile through an 
enterprise is “a genuine issue that 
needs to be resolved,” said Mark 
Richter, solution architect at ThoughtWorks. 
That company’s approach is to 
let each department use the tools they 
have and are comfortable with. That, he 
said, will result in agile development. 

ThoughtWorks helps developers 
using Microsoft technologies become 
more agile by letting them pick and 
choose the right tool for the job. “It’s 
important that developers not be tied 
to a particular vendor for everything 
they do,” said Richter. “Microsoft likes 
you to be wired into their tool set, but 
we allow you to use the best tool for 
the job, especially with build and 
deployment.” 


The company ties its Mingle agile 
project-management platform into 
Microsoft’s Visual Studio, so programmers 
can run Mingle inside the development 
environment and query the 
work that’s on their plate for today, 
without having to jump out of Visual 
Studio to a Web connection all the 
time, Richter explained. Developers 
can create their “Mingle cards” right in 
Visual Studio, he said, and an Excel 
plug-in enables managers to query the 
cards and import the user stories into 
Excel. This helps them track who’s 
working on what, what the assignments 
are for the day, and what kind of 
progress the developers are making. 

Richter also said ThoughtWorks 
offers a .NET client library for Mingle, 
through which Mingle data can be 
pulled down to a workstation client. 
The Visual Studio and Excel plug-ins, 
as well as the .NET client library, are all 
open-source projects hosted on 
GitHub, he said. 


Further, ThoughtWorks’ Go tool for 
continuous integration integrates with 
Microsoft’s Team Foundation Server, 
enabling developers to pull source code 
into Go to do automated builds. 

Go, Richter said, has the ability 
to orchestrate very rich pipelines, 
and “visualize the dependencies 
and flow through whatever 
pipeline infrastructure you’ve 
created to deliver stuff. That can 
be across several browsers while 
running thousands of tests in parallel.” 

Tim Miller, CEO of agile planning 
software provider Rally Software, also 
sees organizations struggling to scale 
agile at the enterprise level, but he 
thought a lack of organizational visibility 
is only part of the problem. The big 
problem, he believed, is cultural. 

“Individual practitioners have always 
been extremely excited by agile. Teams 
have gravitated to it naturally,” Miller 
said. “But these techniques can be 
counterintuitive to executives.” 

Portfolio backlogs: Microsoft's approach to stakeholder engagement 

The approach Microsoft has taken to engage stakeholders is to create in Team Foundation 
Server what it calls portfolio backlogs. "Essentially, you've got teams down 
here, and they've got their own backlogs, and they have a fair amount of autonomy 
with that backlog, but up above that, the level of detail that's necessary to be 
successful with an agile team doesn't scale to what a stakeholder needs," explained 
Team Foundation Server's Aaron Bjork. 

"So we've created what we call portfolio backlogs, and we allow you to link 
these things together so you can have a coarse-grained view that makes sense 
for the level that you need. Our tooling out of the box supports two levels (several 
teams and a leadership team over this), and you can scale it up to this. And every one 
of these levels support the same principles: It supports a backlog you can order, it 
supports a Kanban board, and it also...allows you to customize it to meet the 
conversation that's happening at that level, if you will." 

—David Rubinstein 

Businesses have long required long-term 
planning efforts to be successful, 
and according to Miller, “The people 
who write these heavy plans want to 
apply them everywhere, when in fact 
less planning means better software, 
sooner.” 

He said it takes time to change a culture, 
perhaps years, so it’s important 
that businesses have the ability to pivot 
rather than stick to a plan that isn’t 
working. It’s true for development 
teams, he said, and it’s true for the 
organization at large. 

Scaling agile still starts with team 

TFS Team Room allows developers to communicate on projects as they're being 
worked on. 


BY DAVID RUBINSTEIN 

While scaling agile wide and high is the goal 
for many organizations, Aaron Bjork says it 
still begins with the team. 

“We’re making a concerted effort to be 
attractive to teams,” said Microsoft’s Bjork, 
principal program manager lead working on 
Team Foundation Server in the Cloud Dev 
Services department. “I hear a lot of people 
talk about agile, and they talk about agile principles, 
and I get it. I mean agile is all about 
learning and applying that learning. But I think 
we’re really doubling down on agile teams and 
making agile teams successful. I think it starts 
there. This stakeholder stuff is built from the 
bottom up. It’s got a strong foundation in teams 
being successful. It’s not a top-down, and I 
think a lot of people get that backwards when they want to 
make an agile transition or transformation.” 

So how can managers be most effective? “I actually talk 
about un-managing the process. People ask, ‘How do I 
manage this process?’ And I tell them you’ve got to start by 
getting out of the way and stop telling everybody what to 
do. And you’ve got to talk to your team to find out what they 
need to be successful. And if you can do that, start there and 
then build out what you need. I think we’re trying to roll 
that into some of our thinking.” 



Yet Rally, at its core, is still a project 
and program-management application. 
“We tend to stay above development 
tools,” Miller said. “We have 48 integrations 
across the development life cycle. 
We’re the layer on top that helps 
orchestrate the software process.” 

So Rally sees its integration with 
Team Foundation Server as just another 
one of its integrations, providing the 
same functionality to Microsoft shops 
that it offers to shops of other types. 
“Our strategy,” Miller said, “has been to 
be somewhat neutral.” 

Dealing with disappointment 

Joel Semeniuk, executive vice president 
of innovation at Telerik, agreed that 
agile uptake is reaching a maturity 
stage, but he also said he sees a “trough 
of disillusionment” that has been created 
by agile’s failure to live up to the 
hype. “We’re still seeing pockets of 
agile. It’s hard to scale agile sideways as 
well as up,” he said. 

He also agreed with the manifesto’s 
directive that individuals and interactions 
are more important than tools. 
“Tools aren’t agile or non-agile. We use 
tools to become agile. Team Foundation 
Server allowed agile to grow” in the 
Microsoft ecosystem, he said. “It’s how 
the team used the tool” that made them 
agile, he added. 

Telerik integrated its TeamPulse 
agile project-management tool with 
Team Foundation Server, allowing 
managers to track time against work 
done in TFS, Semeniuk said. Further, 
he said Telerik provides a feedback 
portal for users to make feature 
requests, vote on features or bug fixes, 
and provide feedback to back-end 
developers in near real time. 

The company’s Icenium development 
environment empowers developers 
to create mobile applications using 
HTML5 and JavaScript. Although 
these are not traditionally seen as 
Microsoft developer skills, Semeniuk 
said “Microsoft shops are approaching 
mobile development from a non-Microsoft 
perspective. It’s part of the 
growing BYOD trend. Being able to 
write mobile apps using standards like 
HTML5 and JavaScript is extremely 
agile.” 

Semeniuk went on to say that in 
Telerik’s customers, as well as in 
Microsoft, he’s seeing an embrace of 
agile. “Microsoft believes agile is the 
right way of doing things,” he said. And 
going forward, he expected to see a collapsing 
of DevOps into these agile 
processes. “Those [DevOps] teams are 
great agile teams.” 

VersionOne’s CTO Ian Culling said 
this expansion into DevOps is part of 
the continuous-improvement mindset 
that agile organizations must have. “So 
many large enterprises are interested in 
agile at scale from the get-go,” he said. 
“They’re ready to pull the trigger on a 
significant transition, whether in the IT 
group or even more broadly.” 

But Culling said many organizations 
that have implemented so much open-source 
infrastructure around continuous 
integration, test automation and 
source control are struggling. “More 
progressive teams are moving away 
from the full single suite [of tools] to 
more of a best-of-breed approach,” he 
said. “Large enterprises that continue 
to try to push a single suite on their 
teams, that’ll erode pretty quickly. Agile 
wants to give people autonomy” over 
the tools they use, he said. 

Find this story at http://sdt.bz/60818 

Enterprises see 
tradeoff of lower 
costs, greater 
flexibility versus less 
control over the 
infrastructure 

BY ALEX HANDY 

When Platform-as-a-Service 
began to take shape, back in 
the days when Ruby on Rails 
was still new, the primary beneficiaries 
were startups. These small firms with 
no capital to spend on servers found 
PaaSes to be a great way to save money 
and still host quality applications reliably. 
Now that PaaS is for more than 
just Ruby, however, enterprises have 
been evaluating the value of both public 
and private offerings. And while 
both public and private PaaSes have 
their benefits, it seems as though only 
one side of this new conflict is really 
fighting at all. 

Sinclair Schuller, CEO of PaaS 
provider Apprenda, said that the origins 
of public PaaS formed the views of 
what PaaS should be in the minds of 
enterprises. “You look at public PaaSes 
like Heroku and Engine Yard; their primary 
demographic was the independent 
developer and the Web 2.0 startups,” 
he said. 

“For that class of developer and 
company, the outsourced value of PaaS 
is very high. They don’t want to build a 
server farm, so for them they reap the 
benefit of not outlaying the capital 
investments for these resources. That’s 
very different from the enterprise, 
where they have all these constraints 
they’re working with. What happened 
is, the Herokus and CloudBees [are 
talking] more to the enterprise, and 
they’re taking their model and pushing 
it to the enterprise. That’s why they 
have a chip on their shoulder.” 

In other words, because public PaaS 
providers are having to now pivot their 
offerings into enterprise-style services, 
they’re getting frustrated and framing 
the public-versus-private discussion as 
a kind of religious war, said Schuller. 

Apprenda and other private PaaS 
providers push the idea that enterprises 
cannot simply use public PaaSes due to 
data storage regulations, as well as the 
fact that enterprises can derive value 
from building their own clouds, expensive 
as they may be. For this reason, 
private PaaS providers typically pitch 
their products as a way to bridge the 
gap between public and private clouds. 

Using Apprenda, for example, 
Schuller said developers can provision 
systems both internally and in Amazon 
EC2 from the same control panel. For 
this reason, Apprenda and other private 
PaaS providers tend not to view the 
public/private conflict as a conflict at 
all. 

Abstracting difficulty 

That’s not to say there are no benefits 
for enterprises in the public cloud. 
Indeed, Heroku was initially created to 
solve the major headaches of hosting a 
Ruby on Rails application. 

Matthew Soldo, senior director of 
product management at Heroku, said 
that there are benefits in the public 
cloud that simply can’t be matched at 
any enterprise. One such example was 
the recent discovery of major security 
vulnerabilities in Ruby on Rails. Instead 
of telling its customers to patch their 
software, he said Heroku was able to 
automate its patching across all customers, 
before the vulnerability 
became too widespread. 

“We run a lot of Rails code internally,” 
said Soldo. “One thing that’s powerful 
is that each time you push your code 
to Heroku, we are keeping track of the 
metadata in terms of what gems are 
installed on an application. We have 
that in a relational database, and we 
were able to look at our own applications 
that were affected, but also in our 
customers’ applications.” 
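
A query of the kind Soldo describes might look something like the Python sketch below, which uses an in-memory SQLite database. Heroku's real schema and tooling are not described in this article, so the `app_gems` table, its columns, the app names and the version strings are all invented for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of (app, gem, version) rows with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE app_gems (app TEXT, gem TEXT, version TEXT)")
    conn.executemany(
        "INSERT INTO app_gems VALUES (?, ?, ?)",
        [
            ("blog", "rails", "3.2.10"),    # illustrative version strings only
            ("blog", "pg", "0.14.1"),
            ("shop", "rails", "3.2.10"),
            ("wiki", "rails", "3.2.11"),
            ("tools", "sinatra", "1.3.3"),
        ],
    )
    return conn


def affected_apps(conn, gem, vulnerable_versions):
    """Return the sorted names of apps that have a listed version of `gem` installed."""
    if not vulnerable_versions:
        return []
    placeholders = ",".join("?" for _ in vulnerable_versions)
    rows = conn.execute(
        "SELECT DISTINCT app FROM app_gems "
        f"WHERE gem = ? AND version IN ({placeholders}) ORDER BY app",
        [gem, *vulnerable_versions],
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_demo_db()
    # Pretend version 3.2.10 has been reported as vulnerable.
    print(affected_apps(db, "rails", ["3.2.10"]))
```

Because dependency metadata is recorded at every push, a single query can answer "who is exposed?" across every hosted application, which is hard to do when each team tracks its own servers.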

Another major benefit of public 
PaaS is abstracting database concerns 
away from developers. While running 
and controlling the database is typically 
a very core skill for any enterprise, it 
can cause developers to have to slow 
down and wait for the DBAs to catch 
up during a development process. 

Heroku long ago chose PostgreSQL 
as its public cloud database offering. 
Using Heroku PostgreSQL, developers 
can automate almost all typical database 
tasks, such as sharding, backup, 
schema modifications and automatic 
scaling, all non-trivial database tasks. 

But Heroku is not the only company 
offering an enterprise-quality PaaS 
while still claiming that the private 
cloud eliminates the actual benefits of a 
cloud. CloudBees, a Java-focused PaaS 
company, also believes private clouds 
are missing the point. 

Sacha Labourey, CEO and founder 
of CloudBees, has long kept his public 
PaaS company focused on developer 
needs. To that end, it ties in a life cycle 
for Java, featuring built-in build and 
deploy tools. He pointed to private 
PaaS companies, saying they are simply 
trying to bring an old software model to 
a new market. 

“I think many of those companies 
and vendors took the decision to focus 
on their strengths and sell on-premise, 
rather than try a new model in the pub¬ 
lic cloud. That’s what we’ve seen with 
Red Hat and VMware and IBM and 



'The Herokus and CloudBees are 
talking more to the enterprise, and 
they're pushing their model to the 
enterprise. That's why they have a 
chip on their shoulder.' 

—Sinclair Schuller, Apprenda 


Oracle,” said Labourey. 

He said that, while the idea behind 
private PaaS is to save enterprises mon¬ 
ey on hosted compute resources, “We 
have yet to see any return on invest¬ 
ment on a private PaaS deployment. 
You’re telling customers they have to 
re-implement the cloud. You have to 
train new teams on how to manage that 
and expect some type of efficiency. I’m 
truly skeptical on the output of those 
strategies, but at the same time I under¬ 
stand terribly well how that happens.” 

Fundamental uses 

One thing all PaaS players can agree on 
is that PaaS benefits developers. The 
entire point of a PaaS is to ease the 
deployment process, and to create a 
standardized environment into which 
to place applications before virtualiza¬ 
tion. In this regard, it’s also a boon to IT 
administrators. 

Red Hat is aiming to enhance the 
productivity and velocity of both IT 
administrators and developers with 
OpenShift, its Enterprise Linux-based 
PaaS offering. Ashesh Badani, global 
leader for cloud at Red Hat, said that 
basing OpenShift on Red Hat allows IT 
administrators to use existing skillsets 
to manage their new PaaS. 

“What was valuable was that every 
Linux administrator can become a 
cloud administrator,” he said. “On the 
flip side, the developers and DevOps, 
from their perspective, this is fantastic 
because I can go to a public cloud envi¬ 
ronment, and then when I bring it into 
my shop, I can leverage the same tech¬ 
nology, right? It’s the same in the public 
cloud, and in-house.” 

Badani said that hybrid public/pri¬ 
vate clouds are extremely desirable to 
enterprises, and being able to provision 
internally and externally is useful for 
businesses that need unexpected bursts 
of IT capacity. 

New kid on the block 

This April, VMware’s former CEO 
finally unveiled his new project to the 
public. Pivotal is a pseudo-startup 
formed from the combined staffs of 
VMware and EMC, spun off as a sepa¬ 
rate, autonomous company with Paul 
Maritz as CEO. The goal of Pivotal, it 
was revealed, is to build the next gener¬ 
ation of platform needed for enterprise 
software development and deployment. 

“The short reason for Pivotal’s exis¬ 
tence is that we believe there’s a need 
for a new platform for a new era,” said 
Maritz at the launch of Pivotal. “We’re 
going to set out on a journey to build 
that platform and enable new and excit¬ 
ing things to be done.” 

Maritz detailed a future he envi¬ 
sioned which will be created by “appli¬ 
cations that can’t be done on existing 
substructures. They will require new 
data fabrics and a cloud-enabled data 
center.” 

He cited, as examples of this new 
wave in technology, “The Googles, 
Facebooks and Amazons. If you look at 
the way they do IT, it’s significantly dif¬ 
ferent from how enterprises do IT 
today. These Internet giants have been 
able to store very large amounts of data, 
reason over it, and store it with a low 
cost. We’re seeing the emergence of a 
new generation of data fabrics. History 
teaches us that when the data fabrics 
change, just about everything else 
changes as well.” 

Finally, Maritz cited the enablement 


of consumer-grade technologies, not 
enterprise-grade software, as the end 
goal of Pivotal. “We need to bring con¬ 
sumer-grade capabilities back to the 
enterprise. The gold standard today is 
to be found in the consumer world, and 
we need to bring that back to the enter¬ 
prise world,” he said. 

Pivotal is still working to create its 
next-generation platform from software 
included in VMware’s portfolio, such as 
Cloud Foundry, GemFire and Spring. 

Entrenched players 

Public PaaS advocates aren’t worrying 
however, said ActiveState’s CEO Bart 
Copeland. ActiveState’s private PaaS, 
Stackato, can bridge the gap to public 
clouds. “The Herokus and CloudBees 
of the world say, ‘We’ll take care of it all 
for you,’ but enterprises want control. 
Sometimes you want to limit what the 


service can or cannot do, for example. 
We’re giving them the ‘P’ in Platform- 
as-a-Service so they can run the servers 
themselves,” he said. 

Copeland said enterprises have an 
innate need to twiddle all the knobs and 
dials on their software and IT infra¬ 
structure. Without that control, com¬ 
plex enterprise applications may not 
work with a one-size-fits-all PaaS. 

“Heroku is giving you the whole 
thing in a box to make it really easy to 
deploy an application. We’re doing the 
same thing, but we don’t provide the 
IaaS layer. If you think about it, what 
Heroku does appeals very much to the 
developer: They want to get stuff out 
really quickly. The problem is if a devel¬ 
oper gets something really quickly done 
on Heroku and deploys it, the enter¬ 
prise says ‘You can’t do that, it needs to 
be under our firewall.’ So the develop¬ 
ers say ‘Give me Heroku,’” said 
Copeland. 

That’s something Apprenda’s 
Schuller has seen as well. He said that 
Heroku and other public PaaS 
providers have taught the marketplace 
just what a PaaS should feel like. Unfor¬ 
tunately for them, he said, developers 
are all saying, “ ‘Wouldn’t it be nice if I 
had Heroku at work?’ That created this 
demand internally.” 

Said Copeland, “One thing we’re 
seeing is that the developers just want 
things to go quickly. So they’re working 
on AWS because they like it, but then 
they’re not allowed to put that into pro¬ 
duction. How do you create a common 
platform so they can play on AWS and 
they can do it quickly, but when they’re 
ready to go to production, we redeploy 
and retarget the application? That’s 
something we’re definitely seeing.” 
□ Find this story at http://sdt.bz/60786 


Red Hat's OpenShift: Public AND Private 

Public or private, Red Hat is betting on its OpenShift platform to be what brings devel¬ 
Developers find that 
getting out of their chairs 
improves concentration 
and health, and — bonus! — 
makes them more productive 
BY LISA MORGAN 


Reflecting a wider workplace trend, 
more software engineers are replac¬ 
ing their traditional desks with stand¬ 
ing desks, walking desks or adjustable desks 
in hopes of becoming healthier—or at least 
more comfortable throughout the day. It 
turns out they are also becoming more pro¬ 
ductive. 

Take Fullpower Technologies CEO and 
serial entrepreneur Philippe Kahn. While oth¬ 
er CEOs prefer the comfort of leather execu¬ 
tive chairs and cherrywood desks, Kahn 
prefers keeping his body and his mind in 
motion using a walking desk that features a 
built-in treadmill. 

He has always integrated some sort of exer¬ 
cise into his workday (jogging, biking or sail¬ 
ing) to improve his health and professional 
effectiveness. Apparently his approach gave 
him an edge, if founding four successful com¬ 
panies, inventing the camera phone, or com¬ 
ing up with medical technology serve as any 
indications. About 18 months ago, he added a 
walking desk to his routine so he could work 
and exercise at the same time. There are now 
10 such desks at Fullpower. 

“I find that being more active always 
makes me more creative,” said Kahn. “Sitting 
in a rolling chair, eating Doritos and gulping 
Red Bull all day doesn’t make me more cre¬ 
ative even if at first I think that it does.” 

While other software team leaders agree 
that a vertical physical orientation yields 
health and productivity benefits, most are 
stationary while they code or design soft¬ 
ware, not walking. First, walking desks are 
considerably more expensive than standing 
desks (a couple of thousand dollars versus a 
few hundred dollars or less). Second, walk¬ 
ing desks require more space because the 
desk is perpendicular to the treadmill. Final¬ 
ly, the idea of walking and coding simultane¬ 
ously represents more stimuli than the aver¬ 
age developer’s brain wants to handle. Even 
developers who stand while they code say 
walking would be too distracting. 

Kahn, who still codes, disagrees. “I regu¬ 
larly walk 20,000+ steps without even notic¬ 
ing,” he said. “It’s really a natural thing to do.” 

The options 

Standing desks, walking desks and adjustable 
desks are commercially available, although 
some teams are cobbling them together 


using whatever components are available in 
the office. Kahn’s team builds their own 
walking desks or buys them from TreadDesk. 

“We all spend too much time sitting,” 
Kahn said. “[The walking desk] is a way to 
make a lifestyle change and become more 
productive. Suddenly we can be in motion 
without leaving our desks.” 

While the average person walks about 3 
mph, those using walking desks tend to 
move considerably slower. Kahn recom¬ 
mended starting at a speed of about 0.5 
mph, for example. 

ISV SaltStack and mobile platform 
provider Point Inside use standing desks 
built with components from Ikea. SaltStack 
CTO Tom Hatch said his developers stand 
about two-thirds of the day because the 
combination of standing and sitting makes 
them feel better physically, and they are able 
to concentrate for longer periods of time. 

Bill Johnson, location manager at Point 
Inside, said his post-lunch energy dip van¬ 
ished. 

“I used to drink a sugared drink to com¬ 
pensate, but that’s not the case anymore,” he 
said. “I’m more focused, I can concentrate 
longer, and my productivity has improved.” 

Cloud storage and collaboration solution 
provider Box uses electrically adjustable 
desks that raise and lower at the press of a 
button. In the beginning it made makeshift 
standing desks out of cardboard boxes, but 
now there are 700 to 800 adjustable desks at 
the company, and 60% to 70% of the engi¬ 
neers are using them at any given time. 

“I can concentrate longer if I can change 
the mode from sitting to standing,” said 
Michael Smith, CTO of Box. “You’re more 
likely to be productive if you can change 
your environment.” 

Pavel Grigorenko, a research engineer at 
ZeroTurnaround, said seven of the 20 engi¬ 
neers at his Estonia office have electrically 
adjustable desks, but everyone is clamoring 
for them. At the office in America, the com¬ 
pany’s CEO, as well as the sales and market¬ 
ing departments, are also standing while 
they work. 

“I’ve been using the adjustable desk for 
three or four months, and I can’t imagine sit¬ 
ting all day now,” said Grigorenko. “I sit in a 
high chair about 30% of the time because 
standing all the time isn’t good for you 
either. I have a whiteboard behind me 
so I can just turn and draw when I’m 
standing, which makes me more pro¬ 
ductive. If I were sitting all day, I’d be 
too lazy to get up.” 

Creating a modern workplace 

Companies are focusing more on well¬ 
ness because it positively affects their 
bottom line. For one thing, healthier 
employees tend to have fewer insur¬ 
ance claims. 

“More companies are investing in 
keeping their employees healthy 
because there are going to be fewer lost 
days,” said Carol Stuart-Buttle, princi¬ 
pal at Stuart-Buttle Ergonomics. 
“Standing is better than sitting, but 
standing all day is unusual.” 

Approximately 78% of the population 
will experience back problems at some 
point in time, she said. To avoid that and 
other problems, the height of a desk— 


10gen engineers at work. 

whether the user is sitting or standing— 
should be adjusted such that the user 
can easily rest his or her forearms on the 
desk rather than reaching forward. 

“When you use a mouse, you have to 
be precise,” said Stuart-Buttle. “If your 
forearm isn’t properly supported, it is 
being stressed.” 


Sitting all day stresses the lower 
back and decreases metabolism, while 
standing all day can cause varicose 
veins. Therefore, the best option is an 
adjustable desk that allows the user to 
sit or stand at will, Stuart-Buttle said. 
Commercially available desks come in 
spring models, crank models and elec- 


Build your own ergonomic standing desk 


If you're going to cobble together your own standing desk, it's 
important to get the ergonomics right. If you fail to create an 
ergonomically sound work environment, you may be less com¬ 
fortable and less productive than you could be. Carol Stuart-But¬ 
tle, principal of Stuart-Buttle Ergonomics and a leading certified 
ergonomist, offered the following tips: 

■ Opt for an "adjustable" desk that allows you to stand and sit. 
Standing all the time can cause varicose veins; sitting all the time 
strains the lower back and decreases metabolism. An articulat¬ 
ing monitor arm will allow you to easily adjust the height of a 
monitor. You can use a secondary surface on top of your desk to 
raise the height of your keyboard and mouse accordingly. 

■ Make sure the desk height is right whether you're standing or 
sitting. You should be able to rest your forearms comfortably on 
the desk, giving you ample support for using a mouse. If you have 
to reach forward, you are straining the neck and shoulder muscles. 

■ Make sure the desk is stable. Using a mouse effectively 
requires precise movement. You need good support to rest your 
forearm, keeping your wrist as straight as possible, to effectively 
and safely use the mouse without stressing the arm. 

■ If your setup is not adjustable, consider having two setups: One 
for standing and one for sitting. That way, you won't have to 
move your monitor(s), keyboard and mouse. (This works well if 
you have an L-desk.) 

■ You should not have to lean forward to see your monitor. When 
the monitor is at the wrong distance, it can cause a person to lean 
forward. This is one of the leading causes for physical discomfort 
of the back, shoulders and neck. Make sure the monitor is posi¬ 
tioned close enough so that you don't have to lean forward to see 
it. When sitting back in a chair with your arms on the armrests, the 


monitor is often forward on the desk. This angle changes when 
standing. You may need to push the monitor back so that you can 
stand close to the desk and support your forearms. 

■ You should not be resting your weight on your wrists. Leaning 
forward causes the user to bear weight on his or her wrists, caus¬ 
ing stress. Prolonged bent wrists or pressure on the wrist can 
cause tendinitis or lead to carpal tunnel syndrome. Good forearm 
support and a straight-wrist posture are important. 

■ Once you're set up, mix it up. Standing has its benefits, but it's 
important to ensure that you're not causing stress to another 
part of your body. Ideally you should mix sitting and standing 
throughout your workday. — Lisa Morgan 
Sage advice 

You can buy a ready-made desk that's built for standing, or you can buy an 
adjustable desk for both sitting and standing. If you buy a ready-made prod¬ 
uct, it's going to cost more than something you cobble together. If budget is 
a problem, set up some cardboard boxes and a piece of wood. It's not 
ergonomically sound, but you'll quickly discover whether standing is for you. 

When given the choice of a standing or adjustable desk, opt for the latter 
if budget allows. Most people who stand do so 60% to 70% of the day. If the 
desk adjusts with a button or a crank, it is easier to transition from one posi¬ 
tion to another than if the user has to move furniture and equipment. 

Finally, be patient. Any change requires adjustment. While the first days 
or weeks may be uncomfortable, with practice and the right ergonomics, 
new users will likely find they feel better, have more energy, and are able to 
maintain higher levels of concentration throughout the day. 

—Lisa Morgan 





tronically adjustable models. Some do- 
it-yourself configurations also allow the 
user to stand or sit, although the shift in 
orientation may require the user to 
move around equipment and furniture 
components. 

While standing is better from an 
ergonomic standpoint, some of the fur¬ 
niture might not be as friendly from an 
environmental standpoint. Andrew 
Erlichson, VP of education and cloud 
services at open-source database com¬ 
pany 10gen, said the noise generated 
by electrically adjustable desks is a 
drawback. 

“We work in a bullpen where you 
want to be quiet,” said Erlichson. “We 
got a few adjustable desks that sounded 
like a hospital bed, and it took 30 to 40 
seconds to get the desk from one posi¬ 
tion to another.” 

While it is possible to buy or create 
an ergonomically sound standing desk, 
the easier it is to adjust, the more likely 
people are to use it, Stuart-Buttle said. 

“The ability to hit a button and 
adjust the height is huge,” said Luke 
Galen, CTO of Precision Nutrition. “If 
you can get into the zone, it’s easier to 
focus on coding, and at the end of the 
day you feel good.” 

Precision Nutrition’s tech group uses 
eight standing desks and two walking 
desks. (One person still sits.) Galen said 
he stands 60% of the day. His electroni¬ 
cally adjustable desk accommodates 
three 24-inch displays and a laptop. 
Atypically, his company gives each 
employee a US$10,000 annual equip¬ 


ment budget and encourages them to 
spend some on a standing desk. 

“If you’re going to spend twice as 
much on a desk, it has to be a good 
investment,” said Galen. “The costs of 
our desks have been recouped by pro¬ 
ductivity increases alone, and that 
doesn’t even factor in the health bene¬ 
fits.” 

“Labor is the most costly expense,” 
said Allan Branch, cofounder of 
LessAccounting.com. “Even 15 min¬ 
utes more productivity is significant. If 
you’re saving two to three hours a 
month, it pays for itself.” 

What changes when developers stand 

Most people interviewed for this arti¬ 
cle said that standing affected their 
overall productivity, their energy level, 
their ability to concentrate and their 
interaction with others. Most also said 
their initial experience changed as 
time went on. 

“It does take some time to get used 
to. In my particular case, I now have a 
hard time focusing when I sit down,” 
said Fullpower’s Kahn. “Initially the 
perception is that it is more difficult to 
focus, but after a couple of weeks it’s 
hard to go back to the old ways.” 

Ace Bhattacharjya, founder of Med- 
icalRecords.com, said the first few 
months were painful, although others 
said they adapted to standing in a cou¬ 
ple of days or weeks. One way to lessen 
fatigue is to buy more-comfortable 
shoes or a gel mat, although some say 
the gel mats can get in the way of the 


chairs they use 30% to 40% of the day. 

Standing may also adversely affect 
concentration at first, although those 
standing and sitting claim that their abil¬ 
ity to change positions helps maintain 
concentration throughout the workday. 

Interestingly, what people do while 
they sit or stand varies. For example, 
Branch stands when he codes and sits 
when he designs. Charitybuzz.com 
CTO Sameer Chowdhury sits when he 
codes or pair-programs but stands when 
he’s on the phone. When someone is 
concentrating on something that 
requires significant brainpower, such as 
an algorithm, most interviewed for this 
story preferred to sit. 

Standing also may change team 
dynamics. For example, Bhattacharjya 
said his team members are more 
inclined to get to the point when he’s 
standing. Others said that because they 
are already standing, they are more 
inclined to move around, which may 
mean pacing or walking over to a team 
member’s workspace. 

While standing is becoming part of 
the culture in some software teams and 
companies, few are going to the 
extremes Box does. When Smith’s team 
moved to its new office, it had contests 
to see who could stand the longest. The 
company also used Fitbit pedometers 
to measure how much walking or exer¬ 
cising people were doing. 

“We had a contest to see which 
department exercised the most,” said 
Smith. “It fosters a spirit of wellness.” 
□ Find this story at http://sdt.bz/60799 















I www.sdtimes.com | July 2013 | SD Times | 


COLUMNS 1 57 


Code Watch 

BY LARRY O'BRIEN 


Architecture's Big Ball of Mud 


Software architecture—the highest-level com¬ 
ponents of your application and the way they 
communicate—is a crucial part of your enterprise 
application. Small, standalone applications may be 
able to survive a lack of high-level organization and 
a philosophy of structure. But since another com¬ 
mon definition of software architecture is “the stuff 
that’s hard to change,” it’s foolish to ever embark on 
a project without paying attention to architecture. 

Discussions of software architecture have 
become less common as the industry has embraced 
lean processes and code-first approaches. This has 
furthered the most common architecture: The “Big 
Ball of Mud.” Big Ball of Mud systems are charac¬ 
terized by high-level units of arbitrary and inconsis¬ 
tent size. Sometimes everything is just in one big 
directory structure, but more commonly today, a 
Big Ball of Mud consists of source code that is 
spackled over some potentially useful infrastructure. 

The source code, though, is wildly interdepend¬ 
ent and filled with assumptions about distant con¬ 
cerns. If you’ve ever worked on a system where it 
took two weeks before you dared make the first 
check-in, or where you had to talk with someone in 
another office before fixing a clearly-doesn’t-do- 
what-it-claims defect, you’ve worked with a Big 
Ball of Mud. 

Older readers may remember an analysis concept 
called “Perfect Technology Assumption.” This con¬ 
cept held that, when sketching out the scope of a 
project, one ought to assume infinitely fast proces¬ 
sors and capacity, foolproof messaging, and whatever 
input and output technology one wanted. Of course, 
one didn’t really believe in absurdities such as multi¬ 
gigahertz processors, 300-DPI displays, and voice 
input, but the idea was that one needed to avoid, in 
high-level discussions, linking the system to the lim¬ 
itations of technologies as they are. 

“Perfect Technology” became counter-produc¬ 
tive with the rise of dominant operating systems 
with extensive APIs and, more recently, the advent 
of “opinionated frameworks” such as Ruby on Rails. 
For the past two decades, software design has start¬ 
ed with an assumption of a target operating envi¬ 
ronment. That operating environment generally 
restricts the choice of programming language and 
often strongly suggests an entire set of choices 
regarding physical architecture, storage technology, 


and even component structure and design. 

Technology stacks provide so much prepackaged 
functionality—and opinionated frameworks 
embody so much good thinking—that there’s no 
question that the “Perfect Technology Assumption” 
is now the wrong approach. However, Perfect Tech¬ 
nology did have the benefit of requiring developers 
to consciously address software architecture. 

Software architecture is not the same as the 
technology stack, your frameworks, and their accu¬ 
mulated decisions. Those things may constrain or 
facilitate the architecture of your application, but 
it’s incorrect to just rattle off a list of technologies 
or show the diagram that illustrates the architec¬ 
ture of the framework. Your application’s architec¬ 
ture consists of your decisions about the highest- 
level parts of the application and how they interact. 

It’s true that if you use a framework and your 
application is small enough, you 
might avoid (or at least hide) the 
charge of having developed the 
dreaded Big Ball of Mud. But 
most SD Times readers are in the 
business of enterprise applica¬ 
tions, and those developing 
mobile applications are facing a 
market where iOS is no longer on the majority of 
devices and Android is maddeningly fragmented. 

Martin Fowler’s “Patterns of Enterprise Applica¬ 
tion Architecture” remains an indispensable text for 
enterprise architects. Its greatest strength is its clear 
discussion of layered architectures in systems that 
use a relational database for storage. This still 
describes the large majority of enterprise systems. 
However, the past decade has seen an increase in 
the sophistication of service-oriented architectures, 
a renewed interest in alternatives to relational stor¬ 
age, and the shift toward a mobile “post-PC” world. 

Other than Fowler, the most valuable texts on 
architecture are the three volumes in the “Pattern- 
Oriented Software Architecture” series. Despite 
their age, they throw some of the benefits and draw¬ 
backs of the patterns into starker relief. For instance, 
I’m not sure that I’ll see a 21st-century enterprise 
using the Unix-style “Pipes and Filters” architecture, 
but I’m positive that knowledge of “Pipes and Fil¬ 
ters” will help clarify an issue. And, hopefully, keep 
some system from being a Big Ball of Mud. 
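
For readers who have not met the pattern, "Pipes and Filters" can be sketched in a few lines. This is only an illustration of the idea using Python generators; the filter names and the `pipeline` helper are invented for the example and do not come from the "Pattern-Oriented Software Architecture" books.

```python
def read_lines(text):
    """Source: emit one line at a time."""
    for line in text.splitlines():
        yield line


def strip_blank(lines):
    """Filter: drop empty or whitespace-only lines."""
    for line in lines:
        if line.strip():
            yield line


def to_upper(lines):
    """Filter: transform each line independently of the others."""
    for line in lines:
        yield line.upper()


def pipeline(source, *filters):
    """Pipe: connect each filter's output to the next filter's input."""
    stream = source
    for stage in filters:
        stream = stage(stream)
    return stream


result = list(pipeline(read_lines("alpha\n\nbeta\n  \ngamma"), strip_blank, to_upper))
print(result)
```

Each filter knows only about the stream it receives, so stages can be reordered, removed or tested in isolation. That independence is the opposite of the "assumptions about distant concerns" that mark a Big Ball of Mud.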


Larry O’Brien is a 
developer evangelist/ 
advocate for Xamarin. 
Read his blog at 
www.knowing.net. 




□ Find this story at 
Analyst View 

BY MICHAEL FACEMIRE 

Is your organization ready for mobile? 


Michael Facemire is a 
senior analyst at 
Forrester Research, 
serving application 
delivery and development 
professionals. 


Mobile will force you 
to transition from back-end 
development to user- 
experience development 


□ Find this story at 
http://sdt.bz/60797 


Demand for the creation of new mobile appli¬ 
cations is increasing rapidly, but will you be 
ready to meet these new challenges? If you think 
that satisfying business demands for mobile apps 
simply means hiring a few developers and bolting a 
new front end on existing systems of record, you 
couldn’t be more wrong. 

Mobile development presents completely differ¬ 
ent challenges than any past enterprise development 
project. And these challenges aren’t limited to devel¬ 
opment teams: They will permeate all aspects of the 
software development life cycle (SDLC), enterprise 
architecture, and the methodologies used to develop 
and deliver mobile applications. 

Mobile revamps your design 

Most development organizations already have a ded¬ 
icated design team or a design center, but mobile 
makes this an absolute require¬ 
ment. Good functionality is impor¬ 
tant, but bad design can neuter 
even the best functionality. Shops 
with a great design ethos and an 
understanding of how to convey 
these designs have a major com¬ 
petitive advantage. Look no fur¬ 
ther than Infor, a vendor in the enterprise resource¬ 
planning space. Infor created and staffed a complete, 
standalone internal creative agency, focusing on the 
user experience with its software suite. 

Additionally, mobile embraces new development 
paradigms. Developers have amassed certain lan¬ 
guages, tools, skills, processes and work habits. 
Mobile will force you to focus developers on the 
transition from back-end development to user-expe¬ 
rience development, and from back-end adaptations 
to the scale and performance of the platform. 
Mobile developers must master event-driven archi¬ 
tectures and asynchronous programming models, 
and be competent with JavaScript frameworks. 

As for testing, when it comes to mobile, every¬ 
thing happens sooner and is now everyone’s respon¬ 
sibility. Since the cadence of mobile development is 
shorter, quality assurance teams don’t have time to 
wait for development to finish before starting a test, 
and developers must understand the organizational 
test infrastructure. Mobile development employs 
automated test scripts and continuous integration 


frameworks. The net effect of these measures is that 
large portions of tests are performed every time code 
is checked into the source-code repository. 

Visualize your requirements process 

Mobile’s need for increased cadence fits hand in 
glove with lean and agile methodologies. This is 
especially true for its use of common tools and 
common documentation formats between teams. 
Agile focuses on developing a minimum viable 
product, delivering usable parts of it more fre¬ 
quently and in smaller chunks, and using feedback 
loops to adapt quickly to change. 

Traditional approaches to authoring require¬ 
ments have teams of business analysts writing tomes 
of documentation and distributing them all at once 
to development teams, a recipe for disaster. A more 
agile approach uses mockup tools to create visual 
requirements that become the focal point for driv¬ 
ing all design and development decisions. Design 
tools are beginning to import visual mockups and 
generate rapid prototypes, stubbing out the service 
layer for later implementation by the development 
team. Native integration with your ALM tool will 
increase SDLC efficiency. 

You build to scale or you plan to fail 

Remember the early days of e-business: All you had 
to do was expose legacy applications via a Web inter¬ 
face and you were in business—or so you thought. 
Then 10 million customers hit your site all at once. 
Ignoring the impact of mobile on enterprise archi¬ 
tecture may be the single most glaring oversight. 

Service-oriented architectures, largely imple¬ 
mented using SOAP (Simple Object Access Protocol) and 
XML-RPC, have served the desktop 
browser perfectly. While mobile presents new chal¬ 
lenges, Web protocols are verbose and often session- 
and state-based. Mobile clients, on the other hand, 
don’t have a consistent connection supporting long- 
running sessions. Parsing verbose data packets filled 
with content decoration (XML tags, for instance) 
places a heavy burden on the device battery. 
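
To give a rough sense of what "content decoration" costs, the sketch below serializes the same records once as XML and once in a more compact form. JSON is used for the comparison as an assumption of this example, since the column does not name an alternative, and the `orders` records are made up. It compares payload size only and measures nothing about battery use.

```python
import json
import xml.etree.ElementTree as ET

# Made-up sample records for the comparison.
orders = [{"id": i, "status": "shipped", "total": 19.99} for i in range(3)]


def to_xml(records):
    """Serialize records as XML, one element per field."""
    root = ET.Element("orders")
    for record in records:
        node = ET.SubElement(root, "order")
        for key, value in record.items():
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def to_json(records):
    """Serialize the same records as compact JSON."""
    return json.dumps(records, separators=(",", ":")).encode("utf-8")


xml_bytes = to_xml(orders)
json_bytes = to_json(orders)
print(len(xml_bytes), len(json_bytes))
```

Every XML field carries an opening and a closing tag, so the markup overhead grows with each record the client has to download and parse.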

The mobile application development journey is 
unlike any enterprise development project you’ve 
worked on in the past. By anticipating pitfalls and 
adapting your IT organization accordingly, you can 
ensure a smooth mobile development process. 
Guest View 

BY ROHIT SETHI 

Don't focus on the OWASP Top 10 list 

How much emphasis does your organization 
place on the Open Web Application Security 
Project (OWASP) Top 10 list? 

The 2013 Top 10 list of Web application security
flaws was recently finalized, and it will likely
become the de facto yardstick that many developers
will use to test the security of their applications.
Even the Payment Card Industry’s data security
standards defer to the Top 10 list.

But there’s one problem: the list of security 
flaws doesn’t map cleanly to software security 
requirements. After all, it includes a number of 
broad categories of threats with large subsets that 
are easy to miss. 

While the Top 10 list is a useful awareness tool
for developers, it should not be viewed as a prescriptive
list of how to build secure software. Think
of all the breaches in popular Web applications
that have occurred over the past few years—specifically
the ones that didn’t follow basic best practices
like hashing and salting passwords. Is it really
possible that none of these applications were PCI
compliant or passed the Top 10?
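
For readers who have not seen the practice spelled out, the following is a minimal sketch of salted password hashing using only the Python standard library. The choice of PBKDF2 with SHA-256, the 16-byte salt and the iteration count are assumptions made for this illustration, not recommendations taken from the column or from OWASP.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Store the salt and digest, never the password itself.
salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

Because each user gets a different salt, two accounts with the same password end up with different stored digests, which defeats precomputed lookup tables.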

And consider the recent HP 2012 Cyber Risk 
Report, which revealed these dismal findings on 
Web application security culled from static-analysis 
testing: 

• 92% were vulnerable to information leakage 
and improper error handling 
• 88% had insecure cryptographic storage 
• 86% had injection flaws 
• 75% had insecure direct object reference 
• 61% had broken authentication and session 
management 

Bear in mind that most of these vulnerabilities 
have been on the OWASP Top 10 list for years, and 
they are not at all exotic. So why are these failure 
rates still so high? 

Overly broad categories 

The Top 10 is treated like a prescriptive guide for
secure development, but it’s simply too broad to be
used for specific requirements. Take “sensitive data
exposure” (A6 on the list). Unlike more specific
concerns like “cross-site scripting” (A3), in which
it’s quite clear what the developer needs to look for,
“sensitive data exposure” is open to interpretation.

For instance, it could mean a common step such
as encrypting confidential data during transmission
or during storage. But it could also refer to less-common
tasks such as avoiding caching confidential data
in temporary files or using unsafe cryptographic
modes. This is the trap many organizations fall into.

The list includes four other threat categories that
are too general to be used as requirements: “injection”
(A1), “broken authentication and session management”
(A2), “security misconfiguration” (A5), and
“missing function-level access control” (A7).

Few organizations understand how to properly
assess these open-ended categories. Instead, they
limit their scanning to a small subset of the actual
threats and then assume the application has passed
that requirement. The same is true with organizations
that perform penetration testing to verify the
application’s security.



Rohit Sethi is vice president of SD Elements, a
security consulting company. He also created the
OWASP Design Patterns Security Analysis project.


Key vulnerabilities left out 

By its very nature, the Top 10 list 
doesn’t include all threats that 
might pertain to specific Web 
applications. Here are just a few 
security issues that were left out: 

• Mass assignment vulnerability 

• Clickjacking 

• Buffer overflow 

• Validating client certificate chain of trust correctly 

It’s important to keep in mind that the Top 10 
list is only designed for Web applications. Today’s 
applications might have mobile app components, 
rich clients and Web services, all of which have 
their own set of security requirements that may not 
be covered by the overly broad Top 10 categories. 

What developers should use instead 

The OWASP community is aware of the limitations
of its Top 10 list; that’s why they created a more
comprehensive software security requirements program
called the Application Security Verification Standard
(ASVS) project, which is a better fit for developers.

It’s time for developers to stop using the Top 10
list in the wrong way. It was meant to be an awareness
tool, not a list of requirements. Effective security
requirements need to be specific so as to not be
open to interpretation; testable, so that developers
can be sure they’ve met the requirements; and relevant,
so that they actually apply to the application.




Industry Watch

BY DAVID RUBINSTEIN 

Cleaning house 

As I write this, people are scurrying around our
BZ Media offices in the village of Huntington
on Long Island’s beautiful North Shore,
preparing for a move to one of those high-rise
office park buildings that have sprouted up along
our major thoroughfares.

The decisions always come down to what to 
bring, and what to leave behind. And, as I go 
through old notebooks and review press kits I 
received going back to 2000, I’m struck by how 
much the industry has changed over those years. 

Much of it is reflected in the names of companies
no longer around, either swallowed up years
ago by the bigger fish in their
ponds, or left to die along the
shoulders of the technology
superhighway, as newer ideas
and innovations sped past them
in what for me is a 13-year blur.

We remember the companies 
we called winners back in the 
day. Rational Software (swallowed whole by IBM). 
Borland (then Inprise, then Borland again, now a 
Micro Focus subsidiary, with its development tools 
owned by Embarcadero). Telelogic (also eaten 
alive by IBM). Each had great developer tools, and 
each lives on under new leadership. 

But back in the mid- to late 1990s, when technology
was still so new, there were multiple companies
vying in the IDE space, in the middleware
space and in the data space.

Remember Iona? Founded in 1991, they were 
hotshots back in the day, raking in money and 
spending it like drunken sailors. They once held a 
contest in which the winner would win a trip to 
Ireland (the company’s headquarters were in 
Dublin), and then see Bruce Springsteen at just 
about the height of his popularity in a concert over
there. Iona was in CORBA, with its Orbix product,
then it wasn’t after CORBA was no longer the hot 
new technology. It became a SOA company with 
its Artix product, until that too was no longer the 
hot new technology. Where are they now? SOLD! 
to Progress Software in 2008. 

Then there was Curl. In the folder I created for
Curl back in 2001, after its Surge product was
released, I stashed a PowerPoint that explained the
company’s vision for delivering enterprise applications
that required data-driven user interfaces and
more. This would be the beginning of what was
being called “rich Internet applications,” which, of
course, has led us to where we are today. While
Curl’s technology did not win, its vision certainly did.

And no discussion of my old folders would be
complete without bringing up WebGain. If ever a
company seemed like it had everything going for it,
this was the company. Founded as a joint venture
between BEA Systems (grabbed up by Oracle) and
the equity firm Warburg Pincus at the height of the
dot-com bubble, it went on an acquisition spree,
scarfing up such companies as The Object People
(TopLink) and TogetherSoft (the UML modeling
company), and acquiring Visual Cafe from Symantec.
But then the bubble burst, parts were sold, and
the company—whose PR folks were constantly on us
to write about them—silently went away, closing in
2002 without so much as a whimper.

SD Times said goodbye to WebGain this way in 
its Jan. 1, 2003 issue: 

Today, we’re writing about 
companies working in the cloud, 
in NoSQL databases, in mobile 
device development. It will be 
interesting to see in 13 years 
which are thriving and which 
will end up like WebGain.

Welcome!


Dear colleague, 

Big Data is affecting all of us and may well be the future of 
computing as we know it. Companies and IT professionals who 
make the most of this new field are sure to prosper over the next 
5-10 years. Many conferences exist to trumpet the 
potential and fan the hype surrounding Big Data. 

But until now, there has been no conference that 
teaches you HOW to do it. 

Big Data TechCon is the HOW-TO conference for 
Big Data. Featuring practical tutorials and more than 
50 technical classes to choose from, Big Data TechCon
is the biggest, most info-packed, most practical
HOW-TO Big Data conference in the world. No hype. 
All tech, all the time. What else makes Big Data TechCon special? 

• Most of our speakers have been thoroughly vetted using 
evaluations from attendees at previous Big Data TechCons on 
the quality of information presented, as well as the ability to 
state it clearly and produce clear takeaways that you can apply 
in your business today. 

• Pull together your own custom conference by choosing from up 
to six classes in any given timeslot. Whether it’s a deep dive into 
Hadoop, a thorough introduction to Cassandra, or intensive 
classes on data analytics or machine learning, you put together 
the conference that works best for YOU. 

• Network with other technical IT professionals like yourself. 

Most of our attendees are software and data architects, software 
developers and engineers, data scientists, and business and data 
analysts, and there are great opportunities to talk with others 
facing the same challenges as you. Plus, there are meet-ups and 
other chances to meet and talk further with our expert speakers. 

• Great keynotes to inspire you; this conference features 
Doug Cutting, the founder of Hadoop! 

• Extra events for more networking: receptions, lunches, 
our ice cream social and the Women in Big Data Luncheon. 

• Check out cutting-edge technologies and solutions for Big Data 
in our exhibit hall and round out your three-day experience. 

Whether you are looking at dozens of terabytes or hundreds 
of petabytes, from Avro to ZooKeeper, Big Data TechCon has you 
covered! Bring two or more colleagues and save an extra $100 
each. Regardless, you will save the most off the full conference 
price if you register early. 

See you in San Francisco! 



Ted Bahr 

Conference 

Chairman 


The HOW-TO conference for Big Data
and IT Professionals! 

• Learn tips, tricks and techniques that will make you 
your company’s Big Data Expert! 

• Discover how to master Big Data from real-world 
practitioners—instructors who work in the trenches 
and can teach you from real-world experience 

• Hear about other related technologies that can help you 
with your Big Data projects: the cloud, efficient storage 
and warehousing methods, and more 

• Come to Big Data TechCon to master Big Data—get 
practical answers to real problems, learn tangible steps 
to real-world implementation

“Big Data TechCon is loaded with great networking
opportunities and has a good mix of classes with technical 
depth, as well as overviews. It’s a good, technically-focused 
conference for developers.” 

— Kim Palko, Principal Product Manager, Red Hat 



Event Schedule


Tuesday, October 15

7:30 am - 6:30 pm  Registration Open
7:30 am - 8:30 am  Morning Coffee
8:30 am - 10:00 am  Tutorials
10:00 am - 10:15 am  Coffee Break
10:15 am - 12:15 pm  Tutorials
12:15 pm - 1:15 pm  Lunch Break
1:15 pm - 3:00 pm  Tutorials
3:00 pm - 3:15 pm  Coffee Break
3:15 pm - 5:00 pm  Tutorials
5:15 pm - 6:30 pm  Lightning Talks

Wednesday, October 16

7:30 am - 7:00 pm  Registration Open
7:30 am - 8:30 am  Morning Coffee
8:30 am - 9:30 am  Technical Classes
9:30 am - 9:45 am  Coffee Break
9:45 am - 10:45 am  Keynote
11:00 am - 12:00 pm  Technical Classes
12:00 pm - 7:00 pm  Exhibit Hall Open
12:15 pm - 12:45 pm  Sponsored Classes
12:45 pm - 1:45 pm  Lunch Break
12:45 pm - 1:45 pm  Women in Big Data Luncheon
1:45 pm - 2:45 pm  Technical Classes
2:45 pm - 3:15 pm  Coffee, Ice Cream in Exhibit Hall
3:15 pm - 4:15 pm  Technical Classes
4:30 pm - 5:30 pm  Keynote
5:30 pm - 7:00 pm  Networking Reception in Exhibit Hall
7:15 pm - 8:45 pm  Fireside Chats

Thursday, October 17

7:30 am - 4:00 pm  Registration Open
7:30 am - 8:45 am  Morning Coffee
8:45 am - 9:45 am  Technical Classes
10:00 am - 11:00 am  Keynote - Doug Cutting
11:00 am - 3:30 pm  Exhibit Hall Open
11:00 am - 11:30 am  Coffee Break in Exhibit Hall
11:30 am - 12:30 pm  Technical Classes
12:45 pm - 1:45 pm  Lunch Break
1:45 pm - 2:45 pm  Technical Classes
2:45 pm - 3:15 pm  Coffee Break & Prizes in Exhibit Hall
3:30 pm - 4:30 pm  Technical Classes
4:30 pm  Conference Closes


Keynotes 

Thursday, October 17 

10:00 am-11:00 am 

Doug Cutting 

Founder of Hadoop 

Doug Cutting is the creator of numerous successful 
open-source projects, including Lucene, Nutch and 
Hadoop. Doug joined Cloudera in 2009 from Yahoo, 
where he was a key member of the team that built 
and deployed a production Hadoop storage and analysis cluster for
mission-critical business analytics. Doug holds a Bachelor’s degree from Stanford
University and sits on the Board (and is currently chairman) of the Apache
Software Foundation.

Special Events


Tuesday, October 15 

5:15 pm -6:30 pm 

Lightning Talks 

Learn something new in a handful of short, targeted talks, 
PLUS names will be drawn for free giveaways. 

Wednesday, October 16 


9:45 am -10:45 am 

Keynote 


12:00 pm -7:00 pm 

Exhibit Hall Open 

See how the Big Data ecosystem is growing and evolving by 
speaking with technical experts in our Exhibit Hall. 


12:45 pm - 1:45 pm

Women in Big Data Luncheon

Please join us for this special event filled with delicious food,
wonderful networking opportunities and an open forum to
discuss what it’s like being a woman in the Big Data industry.
All women attendees are welcome!



2:45 pm - 3:15 pm

Coffee, Ice Cream in the Exhibit Hall


4:30 pm - 5:30 pm

Keynote


5:30 pm - 7:00 pm Networking Reception in the Exhibit Hall 


7:15 pm - 8:45 pm Fireside Chats

Managing Big Data Expectations

As if working with petabytes and zettabytes (and yottabytes!?) of information isn’t difficult
enough, you also have to think about managing the requirements and expectations of
management. How do you deal with an unreasonable deadline? How best to anticipate
the right budget? What about questionable requirements that could infringe on someone’s
privacy, which has become a sensitive subject since the development of the National
Security Agency’s surveillance program? These difficult scenarios and more will be covered
in the Fireside Chats hosted by some of Big Data TechCon’s expert speakers, so come ready
with questions and your own experiences of dealing with these situations.


Thursday, October 17


10:00 am -11:00 am 

Keynote 

Doug Cutting, Founder of Hadoop


11:00 am -3:30 pm 

Exhibit Hall Open 

Come explore the latest in Big Data developer resources in 
our Exhibit Hall. 

11:00 am -11:30 am 

Coffee Break in Exhibit Hall 




2:45 pm-3:15 pm 


Winner’s Circle prizes announced in the Exhibit Hall

Tuesday, October 15


Full-Day Tutorials 
Overview 

Hadoop: A One-Day, Hands-On Crash Course

Sameer Farooqui 

This full-day tutorial is a fast-paced, vendor-agnostic technical
overview of the Hadoop landscape, and is targeted at both technical
and non-technical people who want to understand the emerging
world of Big Data, with a specific focus on Hadoop. You will be
introduced to the core concepts of Hadoop, and dive deep into
the critical paths of HDFS, Map/Reduce and HBase. You will also
learn the basics of how to effectively write Pig and Hive scripts,
and how to choose the correct use cases for Hadoop. During the
tutorial, you will have access to an individual one-node Hadoop
cluster in Rackspace to run through some hands-on labs for the
five software components: HDFS, Map/Reduce, Pig, Hive and
HBase.

In each sub-topic, you will be provided links and resource
recommendations for further exploration. You will also be given a
100-page PDF slide deck, which can be used as reference material
after the course. PDFs will also be given out for the five short,
hands-on labs. No prior knowledge of databases or programming
is assumed.

Note: You are required to bring a laptop. If you run into an 
issue during the hands-on portions, it is also not guaranteed the 
instructor will be available to help you troubleshoot. 

Level: Overview 

Intermediate 

Data Science in a Spreadsheet: Learning What’s Really
Going on in Those Black-Box Models

John Foreman 

This full-day tutorial will provide lessons on analytics practices
every data scientist should understand. The tutorial will first
introduce mathematical programming, then clustering and outlier
detection (unsupervised learning), then forecasting, Monte Carlo
simulation, and supervised AI modeling. The nitty-gritty of each
practice will be demonstrated using spreadsheets, which you can
download, follow along with, and keep for later reference.

Full disclosure: These spreadsheets accompany the chapters in
the instructor’s book “Data Smart,” which will be released around
the same time as the conference. You will not need the book, just
the spreadsheets.

Level: Intermediate 


mail This icon indicates code will be shown in the session. 

Half-Day Tutorials 
Overview 

Getting Started with Cassandra

Ben Coverston 

Unless you have experience with Google BigTable, HBase or 
Cassandra, column-oriented databases are probably an enigma. 
Cassandra’s data model is both simple and powerful. It takes 
some time to get used to the differences between the relational 
model and Cassandra’s column-based model. 

Cassandra is not schema-less, but we do not model relationships
in Cassandra either. Data Modeling in Cassandra usually
consists of finding the best way to denormalize the data when
you put the data in the database so that you can retrieve it quickly
and efficiently. This workshop will prepare you for success when
modeling your data. This tutorial will dive into Cassandra from a
developer perspective and give you the tools you need to get
started with Cassandra today.
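
As a rough illustration of what denormalizing at write time means, the Python sketch below keeps one lookup structure per query and writes every record into each of them. Plain dictionaries stand in for Cassandra tables here; no Cassandra driver or CQL is involved, and the post records and table names are invented for the example rather than taken from the tutorial.

```python
from collections import defaultdict

# Hypothetical source records, invented for illustration.
posts = [
    {"post_id": 1, "author": "alice", "day": "2013-10-15", "title": "Hello"},
    {"post_id": 2, "author": "bob", "day": "2013-10-15", "title": "Pig tips"},
    {"post_id": 3, "author": "alice", "day": "2013-10-16", "title": "CQL notes"},
]

# One "table" per query, each keyed by the value that query looks up.
posts_by_author = defaultdict(list)
posts_by_day = defaultdict(list)


def write_post(post):
    """Write the same post into every query-specific table (denormalize on write)."""
    posts_by_author[post["author"]].append(post)
    posts_by_day[post["day"]].append(post)


for post in posts:
    write_post(post)

# Each read is a single key lookup; nothing is joined at query time.
print([p["title"] for p in posts_by_author["alice"]])   # ['Hello', 'CQL notes']
print([p["title"] for p in posts_by_day["2013-10-15"]])  # ['Hello', 'Pig tips']
```

The trade is deliberate: data is duplicated and writes do more work so that each read stays a cheap lookup on a known key.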

This tutorial will cover: 

• An introduction to Cassandra in the context of relational 
databases and non-relational alternatives 

• Best practices for modeling your data in Cassandra 

• Cassandra Query Language (CQL version 3) 

• Wide and Composite Columns

• Practical Examples 

• Anti-Patterns (things to avoid) 

For a more advanced look at Cassandra, attend the “Apache 
Cassandra—A Deep Dive” class. 

Level: Overview 


It’s a great conference about newer, emerging technologies. 

— Deependra Das, Sr. Analyst, Mayo Clinic

Tutorials


Tuesday, October 15 


Intermediate 

Cascading Tutorial

Paco Nathan 

The tutorial begins with a quick pre-flight check: Set up and
test your environment, choosing to use either laptop or cloud.
We’ll cover a brief history of Cascading and related open-source
projects (Cascalog, Scalding, etc.), plus an overview of typical use
cases. Then we’ll build and run the simplest-possible Cascading
app, using it to discuss definitions of the most commonly used
components of data pipelines.

We’ll explore some of the theory which supports the use of
abstraction layers for Hadoop: deterministic vs. non-deterministic
query planners, aspects of functional programming, pattern
language, literate programming, and the software engineering
considerations of Hadoop system integration, operationalizing apps,
and design patterns for bringing Enterprise teams together.

Then we’ll work through a progression of sample apps, each
building upon the last to show more sophisticated pipelines and
explore more components of Cascading (Word Count, Customized
Operation, Joins at scale), along with comparisons to
similar constructs in Hive and Pig. We’ll summarize with a full
implementation of TF-IDF (search index) in Cascading, and show
how to instrument and test the app.
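
For readers unfamiliar with TF-IDF, here is a small single-machine sketch in plain Python. It is not Cascading code and does not reproduce the tutorial's implementation; the three-document corpus is invented, and the formula used (term frequency times the natural log of N over document frequency) is one common variant chosen for this illustration.

```python
import math
from collections import Counter

# Tiny hypothetical corpus, invented for illustration.
docs = {
    "d1": "big data needs big clusters",
    "d2": "small data fits on a laptop",
    "d3": "clusters run hadoop jobs",
}


def tf_idf(documents):
    """Return {doc_id: {term: score}} using tf * log(N / df)."""
    tokenized = {doc_id: text.split() for doc_id, text in documents.items()}
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for words in tokenized.values() for term in set(words))
    scores = {}
    for doc_id, words in tokenized.items():
        counts = Counter(words)  # the word-count step, per document
        scores[doc_id] = {
            term: (count / len(words)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


scores = tf_idf(docs)
# "big" occurs twice in d1 and nowhere else, so it scores highest there.
top_term = max(scores["d1"], key=scores["d1"].get)
print(top_term)  # big
```

The word-count step inside the loop is the same building block the tutorial starts from; TF-IDF adds a second pass that discounts terms appearing in many documents.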

Branching out into other languages, we will compare Word
Count also in Cascalog and Scalding, then work through examples
using ANSI SQL (Lingual) and PMML (Pattern). We’ll conclude
by reviewing a case study: Using Cascalog on Open Data
from the City of Palo Alto.

Prerequisites: Bash command line, some programming in Java,
plus familiarity with Git/GitHub.

Note: This class is part lecture and part hands-on; you are
required to bring a laptop.

Level: Intermediate 

Engineering Your Approach to Big Data Solutions

Tony Shan 

This tutorial introduces Big Data Engineering (BDE), which is
defined as the practical application of a systematic, disciplined,
quantifiable approach to the analysis, design, construction, operation
and maintenance of Big Data solutions. BDE is a holistic method
focusing on eight crucial areas: Methodology, Program, Governance,
Resources, Quality, Risk Mitigation, KPI & Financials, and Practice.

BDE also systematically addresses the life cycle of Big Data
solutioning in 12 stages: Plan, Requirement, Analysis, Modeling,
Platform, Design, Development, Integration, Testing, Runtime,
Deployment, and Operation. Each of these 12 stages comprises
individual elements as subdisciplines. For example, the NoSQL
platform options include key-value, column-based, document-oriented,
graph, NewSQL and in-memory stores. Case studies and
working examples will be discussed in great detail in the session
to illustrate the pragmatic use of BDE in real-world implementations.
Best practices and lessons learned are articulated as well.

Level: Intermediate 


Introduction and Best Practices for Storing and Analyzing 
Your Data with Apache Hive

Mark Grover 

This tutorial on Apache Hive will introduce Hive, as well as the 
best practices for storage and data analysis in Hive. Hive is an 
open-source data-warehousing system based on top of Apache 
Hadoop that lets you query, mine and analyze the data stored in 
Hadoop clusters using familiar SQL-like queries. 

This tutorial will go through a hands-on exercise on how users 
can use Hive queries to perform data analysis. Because not all 
analysis can be expressed using SQL-like queries, the workshop 
will cover how to write, test and use User Defined Functions and 
User Defined Aggregate Functions in Hive. This tutorial will then 
go through some of the best practices related to partitioning, 
bucketing and joining various datasets in Hive. 

You will also learn how to leverage other technologies in the
Hadoop ecosystem, such as plugging in Map/Reduce scripts from
Hadoop directly into your Hive queries, and how to integrate
HBase with Hive to share the data across the two systems.
The tutorial will wrap up with a question-and-answer session.

Note: For this tutorial, you are required to bring in a laptop
with Apache Hadoop and Apache Hive installed on it. The best
and easiest way to get started is to download a Demo VM with
Hadoop and Hive installed and configured on it. You may download
such a Demo VM from
ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM+for+CDH4.
VMware, KVM and VirtualBox images are available at that link as well. Also,
please clone the Git repository at github.com/markgrover/bdtc-hive
on the demo VM before you come to the tutorial.


“If you’re in or about to get into Big Data, this is the
conference to go to.”

—Jimmy Chung, Manager, Reports Development, Avectra

Level: Intermediate 

NoSQL for SQL Professionals 

Dipti Borkar 

With all of the buzz around Big Data and NoSQL (non-relational)
database technology, what actually matters for today’s SQL
professional? Learn more in this tutorial about Big Data and
NoSQL in the context of the SQL world, and get to what’s truly
important for data professionals today. In this tutorial, we will discuss:

• The main characteristics of NoSQL databases 

• High-level architectural overviews of the most popular 
NoSQL databases 

• Differences between distributed NoSQL and relational databases 

• Use cases for NoSQL technologies, with real-world examples 
from organizations in production today 

Finally, we will drill down into Couchbase Server and its
underlying distributed architecture, with a hands-on tour of how
NoSQL databases like Couchbase work in a production environment,
including online rebalancing while adding nodes to a cluster,
indexing and querying, and cross-data-center replication.

Level: Intermediate

Advanced 

Cassandra + S3 + Hadoop = Quick Auditing and Analytics 

Anton Yazovskiy 

The Cassandra database is an excellent choice when you need
scalability and high availability without compromising performance.
Cassandra’s linear scalability, proven fault tolerance and
tunable consistency, combined with its being optimized for write
traffic, make it an attractive choice for performing structured
logging of application and transactional events. But using a
columnar store like Cassandra for analytical needs poses its own
problems, problems we solved by careful construction of Column
Families combined with diplomatic use of Hadoop.

Our system needed to support both a high volume of structured,
distributed writes as well as broad analytical capabilities.
Unlike SQL databases, Cassandra does not support ad hoc queries,
and data typically needs to be properly structured and denormalized
at write time. At the same time, decisions need to be made
depending on how often the data is queried, how stale the data
can be, and the allowable latency before results are returned. Our
system handles these different use cases by delegating certain
reporting tasks to Hadoop while keeping some in Cassandra itself.

This tutorial focuses on building a similar system from scratch,
showing how to perform analytical queries in near real time while
still getting the benefits of the high-performance database engine
of Cassandra. The key subjects are:

• The splendors and miseries of NoSQL 

• Apache Cassandra use cases 

• Difficulties of using Map/Reduce directly in Cassandra 

• Amazon cloud solutions: Elastic MapReduce and S3 

• “Real-enough” time analysis 

In particular, the tutorial dives into ways of handling different
kinds of semi-ad hoc queries when using Cassandra, as well as
the pitfalls in designing a schema around a specific analytics use
case. Some attention will be paid to dealing with time-series data
in particular, which can present a real problem when using
Column-Family or Key-Value store databases.

Level: Advanced 

Programming with Scalding and Algebird

Krishnan Raman 

This is a hands-on coding tutorial. We will code up a few Scalding
programs in different domains: portfolio optimization,
healthcare, cosine similarity and random forests. While Scalding
looks like a thin Scala API atop Cascading, this appearance is
deceptive. The power of Scala, combined with the mapping, grouping
and joining primitives in Scalding, along with the Algebird
abstract algebra library, allows for a whole new level of flexibility
with Big Data. Matrix operations in Scalding are powered by Algebird,
and using large-dimension matrices as a primitive, we can
tackle problems in diverse domains that employ linear algebra
over very large datasets in a batch mode.

Note: You are expected to have installed Scala, Scalding and
Algebird on your laptop before the tutorial commences. Access the
slides and Scala code here: github.com/krishnanraman/bigdata.

Level: Advanced


“Great networking opportunities.”

—TK Lee, Education Technologist, Penn State University

Overview

First Steps to Big Data from MySQL

Dave Stokes 

MySQL is the ubiquitous database on the Web, and just about 
every organization has a copy running someplace. But how do 
you get your information in your MySQL instances into a Big Data 
store? This class covers basic data warehousing that can be done 
with the community edition, column storage engines, and finally 
moving into Hadoop (more than 80% of all Hadoop sites feed data 
from MySQL). So if you need to plunge into deep data but need 
some guidance, please attend this class. 

Level: Overview 


How to See and Understand Big Data 


Most Popular! 


Jock Mackinlay 

Visual analysis is an iterative process that exploits the power of
the human visual system to help people work with all kinds of
data. When data is big, people must overcome the challenges of
wide data, tall data, and data from multiple sources, often coming
in fast and furiously. Attend this class to learn how people working
with data can address these challenges. The key technique is
to use multiple coordinated views of data during visual analysis
and storytelling with data.

You’ll learn: 

• What research and practice have taught us about designing 
great visualizations and dashboards 

• Fundamental principles for designing effective coordinated 
views for yourself and others 

• How to systematically analyze data from multiple databases 
using your visual system 

Note: The instructor works for Tableau Software, a provider of
data-visualization solutions.

Level: Overview 


Implementing a Simple Mongo Application 

Deep Mistry 

One simple app, three different technologies deployed locally 
to two different clouds. The application is executed, the source is 
examined, the approach is compared, the tools are demonstrated, 
and your questions are answered. 

Level: Overview 


Graph Database Use Cases

Max De Marzi 

Learn from existing open-source projects how to build
proof-of-concept solutions and how to add a new tool to your
development toolkit. Social Networks, Recommendation Engines,
Personalization, Dating Sites, Job Boards, Permission Resolution
and Access Control, Routing and Pathfinding, and Disambiguation
are just a few of the use cases that lend themselves well to
graph databases.

Level: Overview 

Introduction to Apache Pig, Parts I & II

Jeffrey Breen 

This two-part class provides an intensive introduction to Pig
for data transformations. You will learn how to use Pig to manage
data sets in Hadoop clusters, using an easy-to-learn scripting
language. The specific topics of the 120-minute class will be
calibrated to your needs, but we will generally cover:

• What is Pig and why would I use it? 

• Understanding the basic concepts of data structures in Pig 

• Understanding the basic language constructs in Pig

We’ll also create basic Pig scripts.

Prerequisites: This class will be taught in a Linux environment,
using the Hive command-line interface (CLI). Please come
prepared with the following:

• Linux shell experience; the ability to log into Linux servers and 
use basic Linux shell (bash) commands is required 

• Basic experience connecting to an Amazon EC2/EMR cluster via SSH 

• Windows users should have a knowledge of Cygwin and Putty 

• A basic knowledge of Vi would be helpful but not necessary

Also, bring your laptop with the following software installed in 
advance: 

• Putty (Windows only): You will log into a remote cluster for this 
class. Mac OS X and Linux environments include SSH (Secure 
Shell) support. Windows users will need to install Putty. 

• A text editor: An editor suitable for editing source code, such as 
SQL queries. On Windows, WordPad (but not Word) or 
Notepad++ (but not Notepad) are suitable. 

Level: Overview 


“The conference is great for learning about the theory, concepts 
and technology of Big Data.” 

—Waleed Sarwani, Founder Sarwani Systems 
Technical Classes


Overview 


Managing a World of Data: Geospatial Best Practices 

Norman Barker 

Everything happens somewhere, and organizations are realiz¬ 
ing how important a role location, surroundings and time play in 
decision-making, including: 

• Which driving route is least congested? 

• Where’s the closest top-rated French restaurant? 

• Are field equipment malfunctions related to altitude or time of 
day? 

• What people will be in a particular location given their current 
activity? 

The challenge is how to store, index and query the massive 
quantities of spatial and temporal data generated via mobile 
phones, cameras, computers, sensors and Internet-enabled ap¬ 
pliances. This class will present the best practices emerging from 
the field of geospatial data and the specialized database systems 
developed to manage it. We will cover: 

• Apps that inspire: novel ways geospatial data is being applied in 
government and industry 

• Geospatial data management standards such as GeoJSON and 
GML; which should you follow? 

• Geospatial basics: bounding box and proximity searches 

• Advanced geospatial operations, including storing complex 
geometries, temporal and geo-metadata; performing bounding 
polygon and radius, intersections, and buffering; and best prac¬ 
tices for scaling and partitioning geospatial data 

• Comparison of specific SQL and NoSQL geo-indexing libraries 

Attend this class to learn how to best store, index and query the 
spatial and temporal data generated by mobile devices, sensor 
networks, and Internet-enabled appliances. Fundamentals on 
geospatial data-management standards (GeoJSON, GML), 
bounding box and proximity searches, and the storage of geome¬ 
tries and geo-metadata will also be covered. Databases will in¬ 
clude PostGIS and CouchDB (running on Cloudant). 
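
As a rough illustration of the two basics named above, a bounding box test and a proximity (radius) search can be sketched in plain Python. The function names, the sample usage and the choice of the haversine formula are assumptions made for this sketch, not material from the class; PostGIS and CouchDB would rely on their own spatial indexes rather than a linear scan.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def in_bounding_box(lat, lon, south, west, north, east):
    """Return True if (lat, lon) lies inside the box (no antimeridian handling)."""
    return south <= lat <= north and west <= lon <= east


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, center, radius_km):
    """Linear-scan proximity search over (lat, lon) tuples."""
    clat, clon = center
    return [p for p in points
            if haversine_km(clat, clon, p[0], p[1]) <= radius_km]
```

A call such as `within_radius(points, (42.36, -71.06), 5.0)` would return the points within five kilometers of the given center; a production system would first narrow candidates with a bounding box or spatial index.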

Level: Overview 

Pattern: A Machine-Learning Library for Cascading, 
Migrating PMML Models to Hadoop

Paco Nathan 

Pattern is an open-source project that takes models trained in
popular analytics frameworks, such as SAS, R, SPSS, MicroStrategy,
etc., and runs them at scale on Apache Hadoop. This machine-learning
library works by translating PMML, an established XML standard for
predictive model markup, into data workflows based on the
Cascading API in Java.

PMML models can be run in a pre-defined JAR file with no
coding required. PMML can also be combined with other flows
based on ANSI SQL (Lingual), Scala (Scalding), Clojure (Cascalog),
etc. Multiple companies have collaborated to implement
parallelized algorithms: Random Forest, Logistic Regression,
SVM, K-Means, Hierarchical Clustering, etc., with more machine-learning
support being added. Benefits include greatly reduced
development costs and less licensing at scale while leveraging a
combination of Apache Hadoop clusters, existing intellectual
property in predictive models, and the core competencies of analytics
staff.

Sample code in the class will show apps using predictive mod¬ 
els built in R for anti-fraud classifiers. In addition, examples will 
show how to compare variations of models for large-scale cus¬ 
tomer experiments. Portions of this material come from the book 
"Enterprise Data Workflows with Cascading." 

You will learn how to migrate predictive models to run on 
Hadoop clusters at scale, how to leverage PMML for customer ex¬ 
periments, and how the notion of "ensembles" has enhanced pre¬ 
dictive power: Netflix Prize, Kaggle, KDD, etc. 

Level: Overview 

Untangling the Relationship Hairball with 
a Graph Database

Max De Marzi 

Not only has data gotten bigger, it’s gotten more connected. 
Make sense of it all and discover what these Big Data connections 
can tell you about your users and your business. Come to this 
class to learn some of the different use cases for graph databases, 
and how to spot the non-obvious opportunities in your data. 

Level: Overview 

Using Hadoop to Lower the Cost of Data Warehousing:
The Paradigm Shift Underway

Dave Jespersen 

Data warehouses are bursting from increased data volume, 
and new sources of data are making traditional approaches to 
data analysis costly and slow. Typically, analysts define the prob¬ 
lem, identify data samples and pull the data through an ETL (ex¬ 
tract, transform and load) process. But now, Hadoop is changing 
the data-warehousing landscape by improving data archiving and 
lowering costs by offloading data warehouse processing. A 
Hadoop platform enables companies to easily scale as the volume,
velocity and variety of data continue to increase while providing
even higher-quality results.

This class will cover the operational cost of deploying Hadoop 
relative to more traditional data-warehousing implementations. 
We will cover real-world customer use cases and demonstrate 
how dramatic cost savings (often an order of magnitude) were
achieved through properly deployed Hadoop implementations. 
Level: Overview 
Intermediate


Analytics Maturity Model 

John A. De Goes 

Every company is at a different stage in leveraging analytics to 
improve their operational efficiency and product offerings. In this 
class, you will learn an eight-stage analytics maturity model that 
companies can use to determine how far they are from the most 
analytical companies. 

Level: Intermediate 


Analyzing Tweets with HBase, Parts I and II

Sameer Farooqui 

This two-hour class will cover how to use the Twitter API to 
download and model tweets in HBase, and then run natural-lan¬ 
guage processing against them. We will first cover the architecture 
fundamentals of HBase, including log-structured merge trees, 
data models, memstores, HFiles and Bloom filters. Next, tweets 
will be populated into HBase. Finally, we will explore some of the 
more interesting analysis that can be done with the tweets and 
NLP. All code for this class is publicly released under a Creative
Commons license. 

Level: Intermediate 


Apache Cassandra—A Deep Dive


Most Popular! 


Ben Coverston 

Recently, there has been some discussion about what Big Data 
is. The definition of Big Data continues to evolve. Along with vari¬ 
ety, volume and velocity (which the usual suspects handle well), 
other facets have been introduced, namely complexity and distri¬ 
bution. Complexity and distribution are facets that require a dif¬ 
ferent type of solution. 

While you can manually shard your data (Oracle, MySQL) or
extend the master-slave paradigm to handle data distribution, a 
modern Big Data solution should solve the problem of distribu¬ 
tion in a straightforward and elegant manner, without manual in¬ 
tervention or external sharding. Apache Cassandra was designed 
to solve the problem of data distribution. It remains the best data¬ 
base for low-latency access to large volumes of data while still al¬ 
lowing for multi-region replication. We will discuss how 
Cassandra solves the problem of data distribution and availability 
at scale. 

This class will cover: 

• Replication

• Data Partitioning

• Local Storage Model

• The Write Path

• The Read Path

• Multi-Datacenter Deployments

• Upcoming Features (1.2 and beyond)

For the most benefit from this class, attend the “Getting Started 
with Cassandra” workshop. 

Level: Intermediate 


Building an Impenetrable ZooKeeper

Kathleen Ting 

Apache ZooKeeper is a project that provides reliable and 
timely coordination of processes. Given the many cluster re¬ 
sources leveraged by distributed ZooKeeper, it’s frequently the 
first to notice issues affecting cluster health, which explains its 
moniker: “The canary in the Hadoop coal mine.” 

Come to this class and you will learn: 

• How to configure ZooKeeper reliably 

• How to monitor ZooKeeper closely 

• How to resolve ZooKeeper errors efficiently 

Culling from the diverse environments we’ve supported, we will
share what it takes to set up an impenetrable ZooKeeper environ¬ 
ment, what parts of your infrastructure specifically to monitor, 
and which ZooKeeper errors and alerts indicate something seri¬ 
ously amiss with your hardware, network or HBase configuration. 
Level: Intermediate 

Building Your Own Facebook Graph Search with Cypher 
and Neo4j

Max De Marzi 

Learn how to create your own Facebook Graph Search or how
to build a similar system with your own company data. Also learn
how to interpret natural language into a grammar and use it to
build Cypher queries to retrieve data from a graph. Knowledge of
Natural Language Processing is not required. The instructor works
for Neo Technology, creators of the Neo4j Graph Database.

Level: Intermediate 


“There are some great classes covering a wide range of areas, 
from technical to business-related, general principles and
specific technologies. It’s a good value for the cost.” 

— Nicolas Metis, Software Engineer, CableLabs 
Data Modeling for Chat Messages with Cassandra

Ameet Chaubal 

Cassandra excels at many aspects of database design, such as 
fast ingestion, replication and distributed architecture. However, 
there are some other features of databases and modeling at which 
Cassandra shines and lends itself to interesting use cases. This 
class will focus on some of these features, specifically automatic 
ordering and automatic grouping of related data elements. 

In this hands-on class, we will explore using Cassandra for a 
chat-message storage/retrieval application. The class will set up a 
Cassandra instance in a VM, and demonstrate schema design
using Cassandra CLI and interaction with it using a Java client. 
You will gain an appreciation for Cassandra, its application design 
in Java, and its data-modeling intricacies. 

Prior experience in Java will be helpful; however, even without 
it, watching the demonstration will allow you to gain from the ex¬ 
ercises. 

Please note: You will need a laptop with VMware, CentOS and 
Eclipse installed. 

Level: Intermediate 

Extending Your Data Infrastructure with Hadoop 


Most Popular! 


Jonathan Seidman 

Hadoop provides significant value when integrated with an ex¬ 
isting data infrastructure, but even among Hadoop experts there’s 
still confusion about options for data integration and business in¬ 
telligence with Hadoop. This class will help clear up the confusion. 
You will learn: 

• How can I use Hadoop to complement and extend my data in¬ 
frastructure? 

• How can Hadoop complement my data warehouse? 

• What are the capabilities and limitations of available tools? 

• How do I get data into and out of Hadoop? 

• How can I use my existing data-integration and business 
intelligence tools with Hadoop? 

• How can I use Hadoop to make my ETL processing more 
scalable and agile? 

We’ll illustrate this with an end-to-end example data flow 
using open-source and commercial tools, showing how data can 
be imported and exported with Hadoop, ETL processing in 
Hadoop, and reporting and visualization of data in Hadoop. You 
will also learn recent advancements that make Hadoop an even 
more powerful platform for data processing and analysis. 

Level: Intermediate 


Hadoop Backup and Disaster Recovery 101

Jairam Ranganathan 

Any production-level implementation of Hadoop must have its 
data protected from threats. Threats to data integrity can be 
human-generated (malicious/unintentional) or site-level (power 
outage, flood, etc.). As soon as you start to identify these threats, 
it’s important to develop a backup or disaster-recovery solution 
for Hadoop! 

In this class, you will learn the unique considerations for 
Hadoop backup and disaster recovery, as well as how to navigate 
the common issues that arise when architects and developers 
look to protect the data. 

We’ll cover: 

• How to model your backup/disaster-recovery solution, consid¬ 
ering your threat model and specifics around data integrity, 
business continuity, and load balancing. 

• Best practices and recommendations, highlighting Hadoop in 
contrast to traditional SAN/DB systems; replication versus 
“teeing” models for ensuring DR; replication scheduling; Hive; 
HBase; managing bandwidth; monitoring replication; using 
one’s secondary beyond replication; and a survey of existing 
tools and products that can be used for backup and DR 

After taking this class, you should be able to explain to your 
organization the right way to effect a backup or data recovery 
solution for Hadoop. 

Level: Intermediate 


“The conference has good content and selection of speakers, 
and is well organized in general.” 

—Volker Schulz, VP of Engineering, Idea5 
Hadoop by Example

Serge Blazhievsky 

This class is designed to demonstrate the most commonly 
used Map/Reduce design patterns for various problems. Perform¬ 
ance and scalability will be taken into consideration. 

The class will present a general overview of the problems that 
can be solved using Map/Reduce, scalability and performance 
tuning for clusters of different sizes. The techniques described 
here can be used on all Hadoop distributions. 

The following technical problems will be covered: 

• “Hello world!” of the Map/Reduce universe—a word count ex¬ 
ample 

• Map-only Map/Reduce jobs and their usage for ETL-type
jobs

• Global sorting techniques

• Sequence files and their usage in Map/Reduce jobs

• Map files and their usage in Map/Reduce jobs

• Reduce-side join and its advantages and limitations 

• Map-side join and its advantages and limitations 

Each technique will be provided with a code example that can 
be used as a template. No prior knowledge about the topic is re¬ 
quired; however, some Java knowledge is recommended. 
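
For readers new to the model, the word count example listed above can be sketched in a few lines. This is a single-process Python illustration of the map, shuffle and reduce steps, not the Java template used in the class; the function names are chosen for this sketch only.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle/sort step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())
```

On a real cluster the mapper and reducer run on different nodes and the framework performs the grouping; here the three steps are simply chained so the data flow is easy to follow.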

Level: Intermediate 

Hadoop Design Patterns

Serge Blazhievsky 

This class is designed to demonstrate how to solve the most 
common problems with Map/Reduce technologies, and to opti¬ 
mize your Map/Reduce jobs to run efficiently on a given Hadoop 
cluster. The differences among types of joins will be described 
with real code examples. After this class, you will understand: 

• Different types of joins in Map/Reduce 

• ETL design patterns 

• Sort and secondary sort 

• Data-driven design patterns 
Level: Intermediate 

HBase Schema Design Done Right

Michael Segel 

Schema design is often overlooked, yet it can play a critical role
in overall application performance when working with HBase. This
class is based on practical experience and lessons learned on the
importance of breaking away from the traditional relational-model
approach to schema design, and is appropriate for individuals at
all levels who are looking to improve their knowledge of HBase and
of designing effective schemas.

We will cover the following: 

• Tradeoffs between a flattened schema vs. storing complex
structures in cells

• The use of column families 

• Different key designs, including hashing, salting and com¬ 
posite keys 

• Secondary Indexing 

It is assumed that you have a basic understanding of the
fundamentals of HBase and Hadoop.
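
To make the key-design bullet more concrete, here is a minimal Python sketch of a composite row key with a hash-derived salt prefix. The bucket count, the field layout and the reversed timestamp are assumptions chosen for illustration; they are not taken from the class, and the right design depends on the access pattern.

```python
import hashlib

NUM_BUCKETS = 16  # assumed number of salt buckets
MAX_LONG = 2**63 - 1  # used to reverse timestamps so newer rows sort first


def salted_row_key(user_id, timestamp_ms):
    """Build 'bucket|user_id|reversed_timestamp' as a string row key.

    The bucket is derived from a hash of the user ID, so writes spread
    across buckets while a reader can still recompute the prefix.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    bucket = digest[0] % NUM_BUCKETS
    reverse_ts = MAX_LONG - timestamp_ms
    # Zero-padding keeps lexicographic order equal to numeric order.
    return f"{bucket:02d}|{user_id}|{reverse_ts:019d}"
```

Because the prefix is deterministic, all rows for one user stay in the same bucket and can be scanned together, while different users are distributed across buckets.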

Level: Intermediate 

HBase Use Cases

Justin Hancock 

The class will be an overview of the use cases for HBase. There 
will be an initial overview of HBase and its architecture, recapping 
key concepts such as column families, region servers and master. 
From this point, we will then move onto the use cases HBase is 
best suited to, such as high write loads with fast lookups, and the 
types of application that can be developed on HBase, such as Time 
Series Databases. We will also describe some of the use cases that 
HBase is not suited to, which helps avoid dissonance between 
technology choice and solution requirements. This includes dis¬ 
cussion of why it isn’t suitable for relational analytics or OLTP. 

Finally, we will recap things to consider before embarking on an
HBase implementation project, including design activities,
deployment, tuning and administration considerations. You will
leave with a good overview of HBase and its use cases. These will
be useful for making informed technology-selection decisions,
ensuring that if HBase is chosen, it will satisfy business and
technical requirements.

Level: Intermediate 


“The hands-on tutorials were practical and useful. They had real 
examples, which is exactly what I came for and I was not 
disappointed. Also, the Women in Big Data Luncheon alone 
was almost worth the cost of admission.” 

— Naomi Anderson Sr. Software Developer 
Intro to Machine Learning: A Crash Course, Parts I and II

Paco Nathan 

This two-part, 120-minute class provides a crash-course introduction
to Machine Learning. We’ll start by defining the terminology,
making comparisons with the related fields of
statistical inference and optimization theory, and then review
some history of ML, from early neural nets onward. We’ll consider
a process for feature engineering, with emphasis on using
tools for data prep and visualization, plus how to grapple with
dimensional reduction.

The remainder of the practice will be divided into three parts: 
Representation: a survey of useful algorithms, including proba¬ 
bilistic data structures, text analytics and NLP, plus issues to con¬ 
sider; Evaluation: distinguishing how some methods work better 
for given use cases, including issues of overfitting, bias, etc., and 
the use of quantitative measures; and Optimization: methods for 
improving on a good thing, including how to move from graph 
theory to sparse matrices, ensemble models, plus a look at ML 
competition platforms. We’ll conclude with suggestions for where
to continue further studies.

Prerequisites: some familiarity with programming, probability, 
statistics, linear algebra, and calculus. We will be programming in 
R and Python, along with some bits of Hadoop and Spark. 

Note: This class is part lecture and part hands-on; you are re¬ 
quired to bring a laptop. 

Level: Intermediate 


Proper Care and Feeding of HBase Coprocessors

Michael Segel

Coprocessors are a relatively new feature within HBase. While
they are capable of providing useful and powerful performance
improvements, if they are not designed properly, they can have
extremely detrimental effects. Like the Tribbles of “Star Trek” or the
Gremlins, one must use extreme caution and follow good practices,
or else bad things can happen...

This class provides an introduction to Coprocessors, focusing
on the potential problems that can arise from poor design as well
as issues with the current implementation. This class is geared
toward the more experienced HBase users and will provide some
common examples of how Coprocessors are being used today, as
well as some potential future use cases.

Level: Intermediate


Most Popular!


Real-Time Hadoop

Michael Segel

We will start with a short introduction to different approaches
to using Hadoop in the real-time environment, including real-time
queries, streaming, and real-time data processing and delivery.
Then we’ll describe the most common use cases for real-time
queries and products implementing these capabilities. We will
also describe the role of streaming, common use cases, and products
in the space.

The majority of time will be dedicated to the usage of HBase as
a foundation for the real-time data process. We describe several
architectures for such implementation, and a high-level design
and implementation for two examples: a system for storing and
retrieving images, and using HBase as a back end for Lucene.

Level: Intermediate

Running Mission-Critical Applications on Hadoop

Dave Jespersen

This class will look at what is involved when you move Hadoop
from a lab environment to actual deployment in production. We
will cover the critical enterprise-grade features like data integration,
data protection, business continuity and high availability,
and discuss the ways you can accomplish these in your environment.
We will also identify potential stumbling blocks, identify
what a platform can or can’t provide, and help determine the
scope and level of customization necessary to make your deployment
successful.

At the end of the class, you will better understand how to move
Hadoop from the test bed to production deployment, what is involved
in the process, and how to run a mission-critical Hadoop
environment. Where appropriate, there will be real-world examples.

Level: Intermediate

“Great for high-level learning!” 

—Carol Long, Executive Acquisitions Editor, Wiley Publishing 
Seven Deadly Hadoop Misconfigurations

Kathleen Ting 

Misconfigurations and bugs break most Hadoop clusters.
Fixing misconfigurations is up to you! 

Attend this class to learn how to get your Hadoop configuration 
right the first time. In some support contexts, a handful of common 
issues account for a large fraction of issues. That is not the case for 
Hadoop, where even the most common specific issues account for 
no more than 2% of support cases. Hadoop errors show up far from
where the misconfiguration was made, making it hard to know which log
files to analyze. It pays to be proactive. Come to this class!

Level: Intermediate 

Simple Yet Efficient Web Extraction with OXPath, Part I 

Tim Furche 

Big Data has already changed how we make decisions, whether 
on pricing, recommendations or investment. However, access to 
such Big Data is often expensive or limited to large organizations 
that collect it. Though much is available on the Web, it is often 
only available through Web Forms and HTML pages. 

In this two-part class, we give a thorough overview of large- 
scale data extraction from the Web, as well as its challenges. The 
first part gives an overview of existing tools and walks through 
real-life examples of manual wrappers. In the second part, we 
delve deeper into data extraction, and discuss common patterns 
and fallacies when creating, maintaining, and running large-scale 
data-extraction systems. 

In the first part, we start with an overview of traditional ap¬ 
proaches, outlining their strengths and limitations to enable at¬ 
tendees to more easily decide what tools are most appropriate for 
their needs. We walk through real-life examples for manual wrap¬ 
per creation with XPath and WebDriver, the emerging W3C stan¬ 
dard for programmatic browser control. 

Finally, we show that extracting Big Data from the Web doesn’t 
have to be hard or costly. We will show you how to extract data 
with just a little knowledge of XPath. That’s all you need to get 
started with OXPath, a high-level, high-performance extension of 
XPath for efficient data extraction from any website. OXPath ex¬ 
tends XPath with four well-defined extensions, including the abil¬ 
ity to simulate user actions and to select elements of a Web page 
through their appearance. This allows for easy navigation through 
complex Web applications, and reduces maintenance in the face 
of structural page changes. 

Level: Intermediate 


The Hadoop Ecosystem: Putting the Pieces Together 

Jonathan Seidman 

Everybody’s talking about Hadoop and Big Data, and a number 
of companies are undertaking efforts to explore how Hadoop can 
be applied to optimize their data-management and processing 
processes, as well as address challenges with ever-growing data 
volumes. Unfortunately, there’s still a lack of understanding of 
how Hadoop can be leveraged, not to mention how the tools in 
the Hadoop ecosystem can be used together to implement data- 
processing pipelines. 

This class will seek to provide clarity by first discussing some 
typical real-world use cases for Hadoop that are allowing compa¬ 
nies to address challenges and derive tangible value. We’ll then 
dive deeper to discuss specific tools in the Hadoop ecosystem 
such as Hive, Pig, Oozie, Flume, Sqoop and Mahout. More impor¬ 
tantly, we’ll discuss some example architectures to understand 
how these tools can be used together to create processing 
pipelines that implement some of these use cases. Since Hadoop 
isn’t a panacea, we’ll also discuss criteria for determining when 
Hadoop is a suitable fit and when it isn’t, as well as some sugges¬ 
tions for getting started with a Hadoop pilot project. 

Level: Intermediate 


“There is very little vendor pitch and something for everyone.”

— Mani Sivagnanam, Sr. Manager, Marketing Systems, Staples 
Intermediate - Advanced


Understanding MongoDB: New Features Explored Through
Code

Jonathan Freeman 

This class is geared toward understanding some of the new 
features and enhancements that were released in MongoDB 2.4. 
We'll explore these concepts by building an application that uses 
text search, geospatial queries and the aggregation framework. 
The application will be built entirely in JavaScript utilizing 
Node.js and jQuery. However, the emphasis will be on MongoDB, 
so those who are not experts in JavaScript need not worry. 

Level: Intermediate 

Advanced 

Building Applications That Predict User Behavior Through 
Big Data Using Open-Source Technologies

Simon Chan 

One of the biggest challenges for data engineers building real- 
world predictive applications with Big Data is the steep learning 
curve of multiple data-processing frameworks, learning algo¬ 
rithms and scalable programming. 

In this class, you will get hands-on instruction, aimed at data
engineers, in adding predictive features, such as personalization,
recommendation and content discovery, to your applications using Big
Data. The class will begin with a brief overview of scalable machine
learning for Big Data. You will then see demonstrations of
open-source technologies such as Hadoop, Cascading,
Scalding and PredictionIO with live sample code. A number of
collaborative filtering algorithms will be explained.

You will also see the use of open-source user-friendly control
interfaces to evaluate, compare, select and deploy learning algorithms;
tune hyperparameters of algorithms manually or automatically;
and review the predictive model training status. By the
end of the class, you will master the core concepts of Machine
Learning and be able to apply scalable algorithms in a real software
production environment.

Level: Advanced 

Data Modeling and Relational Analysis in a NoSQL World 


Most Popular! 


Michael Miller 

The new wave of NoSQL technology is built to provide the flex¬ 
ibility and scalability required by agile Web, mobile and enter¬ 
prise applications. Interestingly, any system that supports 
chained Map/Reduce processing (specifically Map/Reduce Map) 
fulfills the basic query requirements of a SQL engine. Therefore, 
we will work to help you bridge the gap between SQL, relational 
(big) data, and the brave new world of NoSQL. 

In this class, you will learn how to model real-world relational
data in a modern document database. We next go on to compile
various SQL operations (SELECT, SUM, AVG, JOIN, etc.) into exceptionally
simple Map/Reduce programs. We finish with a study
demonstrating the performance, scalability and “time-to-value”
benefits of this approach, specifically the pre-computation of materialized
views. The class will be a mix of chalkboard and interactive
demonstrations.

Prerequisites: Bring a laptop with a modern browser (Chrome, 
Safari or Firefox). Previous experience with basic scripting lan¬ 
guages (e.g., JavaScript) is an advantage but not a requirement. 
All data and code samples will be provided at the beginning of 
the class. 
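
To give a feel for how a SQL aggregate might map onto Map/Reduce, the following Python sketch expresses a query of the form SELECT region, SUM(amount), AVG(amount) ... GROUP BY region as a map step and a reduce step. The document fields `region` and `amount` and all function names are hypothetical, and the code runs in a single process; it is an illustration of the idea, not the approach taught in the class.

```python
from collections import defaultdict


def map_order(doc):
    """Map: emit (group key, (partial sum, partial count)) for one document."""
    yield doc["region"], (doc["amount"], 1)


def reduce_stats(values):
    """Reduce: combine partial (sum, count) pairs for one key.

    The output has the same shape as the input values, so the function
    can be applied again to already-reduced partial results.
    """
    total = sum(v[0] for v in values)
    count = sum(v[1] for v in values)
    return total, count


def group_sum_avg(docs):
    """Run map, group by key, then reduce; derive AVG from SUM and COUNT."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_order(doc):
            grouped[key].append(value)
    result = {}
    for key, values in grouped.items():
        total, count = reduce_stats(values)
        result[key] = {"sum": total, "avg": total / count}
    return result
```

Carrying the count alongside the sum is what lets AVG be computed incrementally, which is also why such results can be stored and refreshed as a materialized view.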

Level: Advanced 

Getting Started with R and Hadoop, Parts I & II

Jeffrey Breen 

Increasingly viewed as the lingua franca of statistics, R is a nat¬ 
ural choice for many data scientists seeking to perform Big Data 
analytics. And with Hadoop Streaming, the formerly Java-only Big 
Data system is now open to nearly any programming or scripting 
language. This two-part class will teach you options for working 
with Hadoop and R before focusing on the RMR package from the 
RHadoop project. We will cover the basics of downloading and in¬ 
stalling RMR, and we will test our installation and demonstrate its 
use by walking through three examples in depth. 

You will learn the basics of applying the Map/Reduce para¬ 
digm to your analysis, and how to write mappers, reducers and 
combiners using R. We will submit jobs to the Hadoop cluster and 
retrieve results from the HDFS. We will explore the interaction of 
the Hadoop infrastructure with your code by tracing the input 
and output data for each step. Examples will include the canonical
“word count” example, as well as the analysis of structured data
from the airline industry.

No specific prerequisite knowledge is required, but a familiarity
with R and Hadoop or Map/Reduce is helpful.

Level: Advanced


“Big Data TechCon has a great atmosphere, organization and
extra activities, even a morning run. It’s a great time with many
experts to learn from.”

—Jarek Jarcec Cecho, Software Engineer, Cloudera

In-Database Predictive Analytics

John A. De Goes 

Predictive analytics have long lived in the domain of statistical 
tools like R. Increasingly, however, as companies struggle to deal 
with exploding volumes of data not easily analyzed by small data 
tools, they are looking at ways of doing predictive analytics di¬ 
rectly inside the primary data store. 

This approach, called in-database predictive analytics, elimi¬ 
nates the need to sample data and perform a separate ETL process 
into a statistical tool, which can decrease total cost, improve the 
quality of predictive models, and dramatically shorten develop¬ 
ment time. In this class, you will learn the pros and cons of doing 
in-database predictive analytics, highlights of its limitations, and 
the tools and technologies necessary to head down the path. 

Level: Advanced 

Introduction to Parallel Iterative Machine-Learning 
Algorithms on Hadoop’s Next-Generation 
YARN Framework

Josh Patterson 

Online learning techniques, such as Stochastic Gradient De¬ 
scent (SGD), are powerful when applied to risk minimization and 
convex games on large problems. However, their sequential de¬ 
sign prevents them from taking advantage of newer distributed 
frameworks such as Hadoop Map/Reduce. In this class, we will 
take a look at how we parallelize parameter estimation for linear 
models on the next-gen YARN framework Iterative Reduce and 
the parallel machine-learning library Metronome. 
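
One common way to parallelize SGD for linear models is parameter averaging: each worker runs sequential SGD on its own data partition, and the resulting weight vectors are averaged before the next round. The Python sketch below simulates that scheme in a single process for least-squares regression. It is an assumption-laden illustration; the text above does not confirm that Iterative Reduce or Metronome works exactly this way, and the function names, learning rate and round count are chosen only for this example.

```python
def sgd_epoch(weights, data, lr):
    """One sequential SGD pass over (features, target) pairs, squared loss."""
    w = list(weights)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def average(vectors):
    """Element-wise mean of equally sized weight vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def parallel_sgd(partitions, dim, lr=0.05, rounds=2000):
    """Simulated parameter averaging: local SGD per partition, then average."""
    w = [0.0] * dim
    for _ in range(rounds):
        # On a cluster, each partition would be processed by its own worker.
        local = [sgd_epoch(w, part, lr) for part in partitions]
        w = average(local)
    return w
```

Each round needs only one exchange of weight vectors between workers and the coordinator, which is what makes the scheme a reasonable fit for an iterative, superstep-style framework.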

Level: Advanced 

Large-Scale Distributed Queries with Apache Drill

Jacques Nadeau 

This class will discuss the architecture behind a full ANSI SQL
large-query solution for Big Data. Apache Drill, like Hadoop, was
inspired by a Google white paper; is architected to work with 
nested data structures such as JSON; and can process queries 
against a variety of databases, including MongoDB, HBase and 
Oracle. The class will give an overview of the use cases and pres¬ 
ent the design of some of the critical architectural components. 
Level: Advanced 


Large-Scale, High-Accuracy Entity Extraction Made Easy 

Tim Furche 

Big Data is a great opportunity to make smarter decisions. But 
it is also a great challenge, in particular where Big Data comes as 
huge collections of raw text, logs, tweets, etc. Entity and relation 
extraction are crucial components in turning such collections of 
unstructured text into more meaningful, “smart” data. There ex¬ 
ists a plethora of commercial and open-source services or tools 
for extracting entities such as cities, company names, or prices 
from documents. Unfortunately, traditional services have suffered 
from a trifecta of challenges: low coverage, inconsistent accuracy, 
and complex, tool-specific APIs. 

In this class, we will introduce a recent open-source API, 
ROSEAnn, which provides a simple, uniform interface for most of 
the existing extraction services and tools out there. We will walk 
through several scenarios for using ROSEAnn, from detecting 
mentions of a company to more complex cases combining the 
detection of several entity types. In addition to providing a uni¬ 
form interface, ROSEAnn also allows you to easily “scale up” the 
accuracy and coverage of your entity extraction by a smart inte¬ 
gration of an arbitrary number of extraction services. On entity 
types where the underlying services overlap, accuracy is im¬ 
proved (by reconciling the different results); where they don’t 
overlap, coverage is increased. At the end of this class, you will be 
able to deploy automatic entity and relation extraction easily, and 
make use of the integration features of ROSEAnn to achieve entity 
extraction with unparalleled coverage and accuracy. 

Level: Advanced 


“Big Data TechCon is great for beginners as well as 
advanced Big Data practitioners. It’s a great conference!” 

— Ryan Wood, Software Systems Analyst, Government of Canada 
Making it Real: Leveraging Big Data to Solve Big Problems 


Siva Vaidyanatha 

There is a lot of hype around Big Data and its myriad possibili¬ 
ties. This class makes it real and talks about concrete ways in 
which Big Data is used today. This is an advanced class that dives 
into architecture and design details. For illustration purposes, it 
uses case studies from the retail and life science industries. This 
class provides an overview of: 

• The current state of technology: predictive analytics and Big Data 

• An overview of a common Big Data stack: the Hadoop ecosystem 

• “Beneath the covers” explanation of how Map/Reduce enhances 
predictive analytics 

• Data collection and munging 

• Model creation 

• Visualization and interpretation 

• Scaling for very large data sets 

• Case studies: deep-dive (includes architecture, design and sam¬ 
ple code) 

• Understanding the consumer “genome”: leveraging predictive 
analytics and machine learning for extreme omni-channel per¬ 
sonalization by understanding consumer behavior 

• Drug repurposing in life sciences: computational methods used 
in the screening phases of drug discovery and drug design. 

Level: Advanced 


Selecting the Right Big Data Tool for the Right Job, and 
Making It Work for You 


Most Popular! 


Eddie Satterly 

This class will focus on the various types of Big Data solu¬ 
tions—from open-source to commercial solutions—and the spe¬ 
cific selection criteria and profiles of each. As in all technology 
areas, each solution has its own sweet spots and challenges either 
in CAP theorem, ACID compliance, performance or scalability. 
This class will provide an overview of the technical tradeoffs for 
the list of solutions in technical terminology. Once the technical 
tradeoffs are reviewed, we will review the cost and value of open- 
source solutions versus commercial software, and the trade-offs 
that folks must take to choose one over the other. 

The next phase will go into great detail on use cases for specific 
solutions based on real-world experience. All of the specific use 
cases have been seen first-hand from our customers. The solutions 
will be reviewed to the level of specific technical architecture and 
deployment details. This is intended for a highly technical audience 
and will not provide any high-level material on the solutions dis¬ 
cussed. It is assumed you have working knowledge of solutions such 
as SQL, NoSQL, distributed file systems and time-series indexes. 

You should also have an understanding of CAP theorem, ACID com¬ 
pliance and general performance characteristics of systems. 

Level: Advanced 


Simple yet Efficient Web Extraction with OXPath, Part II 

Tim Furche 

In the second part of this class, we will look at more complex 
wrappers, as well as the maintenance and management of large- 
scale extraction infrastructure. We’ll walk you through several 
examples on how to create wrappers, driven by real use cases from 
finance and competitive pricing. For these examples, we’ll use the 
OXPath Firefox IDE, which allows for the development of OXPath 
wrappers using familiar Firefox developer tools. We will discuss 
how to make wrappers robust and maintainable through a small 
set of wrapper design patterns. OXPath’s open-source engine is 
able to deal with many of the issues that make Web scraping a 
pain, from buffer management to auto-complete fields. 

However, we will also show the limits of the engine and how to 
deal with them. We will conclude the presentation with best prac¬ 
tices for deploying and scheduling the resulting wrappers, e.g., for 
repeated extraction to keep extracted data up to date. 

Level: Advanced 

Staying Alive: Ensuring Service Availability in Hadoop 

Vinithra Varadharajan 

All sorts of things can go wrong in your data center. Ensuring 
that your Hadoop systems stay up through various types of 
threats, from node failures to site failures, is vital to meeting 
SLAs and ensuring a high quality of experience for clients of 
your production-level distributed system. We will discuss the vari¬ 
ous threat models that need to be handled, and the elements of 
how to build highly available architectures for key system services. 

In this class, we will shed light on the complexity of what it 
takes to keep different system services alive, starting from core 
Hadoop services of HDFS and Map/Reduce, and extending to 
higher-level applications such as Hive. This understanding will 
help you evaluate the risk profile and cost of providing different 
SLAs for your entire Hadoop system. In this class you will learn 
about: 

• Worker node versus master node failure characteristics and 
tolerance levels of various services in the Hadoop ecosystem 

• Vital components of each service as they pertain to availability 
(metadata, data, databases, ZooKeeper quorum, etc.) 

• What it takes to set up backup nodes versus backup clusters 

Level: Advanced 


“A great learning experience.” 

—Schalk van der Merwe, CEO, RCS Group 

Time + Space + Transaction: Multidimensional Geospatial 
Analysis with Hadoop 

Dan Rosanova 

Most organizations do a good job of reporting on the past, 
some do a good job of estimating the future, and few have a total 
understanding of their environment, customers, or business 
based on multidimensional analytics that account for time, space 
and transactions. This class will tie these three areas of analytics 
together to teach attendees how to use commonly available infor¬ 
mation and tools to achieve deep insights. 

The arrival of Big Data blurs the boundaries between report¬ 
ing, software development and analytics. This deeply technical 
class will walk through moving beyond the most common sources 
of transactional reporting and temporal analytics to include 
geospatial dimensions that can unlock the true potential of our 
data. By examining CRM (transactional), Web logs (temporal) and 
IP Geolocation (spatial), this deep-dive will walk developers and 
data scientists through a real example of identifying trends across 
transactions, time and space to discover deeper insights and ac¬ 
tionable intelligence. 

From data procurement, transformation, loading and analysis, 
to display and interpretation, this class will teach attendees how 
to dig deeper into their data by reaching across paradigms and 
using machine-learning algorithms and rich presentation plat¬ 
forms to explore and visualize Big Data. 

Level: Advanced 


“This conference is very, very well managed.” 

— Rahul Joglekar, Architect 



Windows Azure HDInsight and PowerPivot: Cloud-Based 
Data Analysis with Familiar and Friendly Tools 

Dan Rosanova 

This class will explore the features of Windows Azure HDInsight 
and Excel PowerPivot, a powerful combination of a cloud-based 
Big Data platform and an Excel front end that allows data 
scientists and analysts to explore semi-structured data in the 
familiar Excel tools that many are already experienced with. From 
provisioning and loading data, to structuring for Hive queries and 
connecting with Excel’s ODBC driver for Hive, this session will 
be a hands-on walkthrough of leading Hadoop tools on the Windows 
platform. Also included will be data exploration using Excel 
tools, and particularly PowerPivot to visualize and explore data in 
a graphical Excel environment. 

The cloud-based experience of Windows Azure HDInsight allows 
for a pay-as-you-go model for processing data on 100% 
Apache Hadoop-compatible clusters with zero configuration 
time. This class will be hands-on and require Excel 2013 and access 
to an HDInsight (or other Apache Hadoop compatible) installation 
(including HDInsight for Windows). Detailed software 
requirements will be sent with the presentation ahead of time, 
and you are expected to be familiar with SQL, Hadoop and Excel. 
Active participation is strongly encouraged but not required, as is 
working in pairs. 

Level: Advanced 
Norman Barker 

Norman is a specialist in developing geospatial data discovery 
and dissemination products. He has spent his career managing 
and developing geospatial products. He is also an open-source 
developer with contributions to MapServer, GDAL and PostGIS. 
Norman holds a Master’s in mathematics from the University of 
Durham, England. He is an avid rugby player. 


Serge Blazhievsky 

Serge is Principal Software Engineer at Nice Systems, and is 
an experienced developer and architect with a rich background 
in C++/Java and distributed systems. Nice Systems 
uses Hadoop infrastructure for various data-processing 
needs. His previous company used Hadoop infrastructure for all report¬ 
ing needs. Before that, Serge designed Hadoop infrastructure used for 
Internet crawling and Web-page analysis. Serge holds a Master’s Degree 
in Computer Engineering from Santa Clara University. Serge is a regular 
contributor to various Hadoop conferences, including the Hadoop User 
Group at Yahoo, the creator of Hadoop. 

Dipti Borkar 

Dipti is Director of Product Management at Couchbase, 
where she is responsible for the company’s flagship prod¬ 
uct, Couchbase Server, and works with customers and 
users to understand emerging requirements for low-la¬ 
tency, scalable data stores. Dipti has deep technical experience in the 
database industry, having worked at IBM as a software engineer and 
Development Manager for the DB2 server team, and then at MarkLogic 
as a Senior Product Manager. 

Jeffrey Breen 

Jeffrey is the Principal of the Think Big Academy at Think 
Big Analytics. Jeffrey has been very active in local user 
groups, has taught and mentored throughout his career, 
and has presented talks recently on R and Hadoop to the 
Data Warehouse Institute, the Chicago Area Hadoop and R User groups, 
and the Boston Predictive Analytics Meetup. Jeffrey has also developed 
and delivered the RHadoop training course, as well as all materials for 
Revolution Analytics. 




Simon Chan 

Simon is a cofounder and product lead of PredictionIO, an 
open-source Machine Learning Server that empowers programmers 
and data engineers to build smart applications. 
PredictionIO itself is built on top of solid open-source 
technology, such as Scala, Hadoop, Mahout, Cascading and Scalding. 
Starting off as a software engineer after graduating from university, 
Simon founded three tech startups in the past 10 years, in the Bay Area, 
in Hong Kong and in Mainland China. He specializes in machine learn¬ 
ing and recommendation technology, with a strong interest in social 
applications. Simon is a Ph.D. candidate in Machine Learning at Uni¬ 
versity College London, and is a frequent speaker in the Data Science 
community. 




Ameet Chaubal 

Ameet works in the Emerging Technology Innovation 
group at Accenture, and has been architecting, developing 
and implementing solutions leveraging distributed sys¬ 
tems for Fortune 500 clients. He has kick-started the Big 
Data training academy at Accenture, and recently spoke at NYC Cas¬ 
sandra Tech Day. 

Ben Coverston 

Ben currently helps coordinate the training and support 
activities at DataStax. He has more than 15 years of devel¬ 
opment experience, and has written code running on some 
of the largest travel websites in the world. He became inter¬ 
ested in Big Data through his experiences in troubleshooting data-re- 
lated problems in which the velocity and volume of data exceeded the 
capabilities of a single machine. 

Doug Cutting 

Doug is the creator of numerous successful open-source 
projects, including Lucene, Nutch and Hadoop. Doug joined 
Cloudera in 2009 from Yahoo, where he was a key member 
of the team that built and deployed a production Hadoop 
storage and analysis cluster for mission-critical business analytics. Doug 
holds a Bachelor’s degree from Stanford University and sits on the Board 
(and is currently chairman) of the Apache Software Foundation. 

John A. De Goes 

John is CEO and CTO of Precog, and is responsible for lead¬ 
ing the design and development of the company’s data- 
warehousing and analysis platform. He has been working 
professionally in distributed systems design and develop¬ 
ment for more than a decade. 

Author of multiple best-selling technical books, and a major con¬ 
tributor to open source, John has an extensive background in scientific 
and distributed computing, and in large-scale analytics. John is a fre¬ 
quent and well-received speaker at industry events. Recent engage¬ 
ments include DataWeek Conference, Glue Conference, Frontier 
Developers, and NEScala. 

Max De Marzi 

Max is a Software Field Engineer at Neo Technology, where 
he built the Neography Ruby Gem, a REST API wrapper to 
the Neo4j Graph Database. He is addicted to learning new 
things, taking on a challenge and finding (and sharing) 
pragmatic solutions. 

Sameer Farooqui 

Sameer is a freelance Big Data consultant and trainer, spe¬ 
cializing in Hadoop and Cassandra. For the past five years, 
he has deployed various clustering software packages in¬ 
ternationally to clients, including Fortune 500 companies, 
governments, hospitals and banks. Most recently, he was a Systems Ar¬ 
chitect at Hortonworks, where he specialized in designing Hadoop pro¬ 
totypes and Proof-of-Concept use cases. Previously, Sameer worked at 
Accenture's Silicon Valley R&D lab, where he was responsible for study¬ 
ing NoSQL databases, Cloud Computing and Map/Reduce for their 
commercial applicability to emerging Big Data problems. At Accenture 
Tech Labs, Sameer was the lead engineer for creating a 32-node prototype 
using Cassandra and Amazon Cloud Computing to host 10TB of 
Smart Grid data. He also worked on a more than 30-person team in the 
design phase of a multi-environment Hadoop cluster pilot project at 
NetApp. Before Hortonworks and Accenture, Sameer spent five years at 
Symantec, where he deployed VERITAS Clustering and Storage Foundation 
solutions (VCS, VVR, SF-HA) to Fortune 500 and government 
clients throughout North America. 

John Foreman 

John is the Chief Data Scientist for MailChimp.com. He 
holds a graduate degree in Operations Research from MIT 
and has worked as an analytics consultant for the Department 
of Defense, Coca-Cola, Royal Caribbean International, 
and Intercontinental Hotels Group. His expertise is in 
optimization modeling, revenue management and predictive modeling. 

For fun, he teaches analytics concepts through narrative fiction at 
Analytics Made Skeezy. 

Jonathan Freeman 

Jonathan is a Developer and Tech Evangelist for Open Soft¬ 
ware Integrators. He's a JavaScript specialist, Big Data and 
NoSQL enthusiast, writer, speaker and jazz musician. You 
can find his articles and blog posts on the Open Software 
Integrators website and, as a guest writer, on InfoWorld's Strategic 
Developer blog. 

Tim Furche 

Tim heads the DIADEM lab at Oxford University. He is also 
a fellow at the Oxford-Man Institute for Quantitative Fi¬ 
nance, where he investigates the applications of Big Data 
extracted from the Web for predicting financial indicators, 
funded by the Man group. His research interests include data extrac¬ 
tion, XML and semi-structured data, in particular query evaluation and 
optimization, and advanced Web information systems. He has au¬ 
thored more than 50 peer-reviewed scientific publications, some of 
them cited more than 200 times. His main contributions are on XPath 
optimization and evaluation, on linear time and space querying of 
large graphs, and on languages for Web data extraction, querying, and 
search. From 2004 to 2008, he co-coordinated the working group on 
“Reasoning-aware Querying” in the EU Network of Excellence REWERSE 
at the Ludwig Maximilian University of Munich. 

Tim has extensive experience as a lecturer. He has given several lec¬ 
tures and hands-on courses at the University of Munich and at interna¬ 
tional summer schools. He has given dozens of research talks at 
international conferences, including the keynote at the International 
Web Engineering Conference 2011. He has also held several tutorials 
both at academic and developer conferences. 

Mark Grover 

Mark is a Software Engineer at Cloudera and a contributor 
to the Apache Hive open-source project. He is also a sec¬ 
tion author of O'Reilly's book on Apache Hive called “Pro¬ 
gramming Hive." Mark is an active respondent on the Hive 
mailing list and IRC channel. 






Justin Hancock 

Justin is a tech-industry veteran with more than 15 years' 
experience across a number of industries. He has previ¬ 
ously worked as an independent consultant, helping archi¬ 
tect, design and deploy Deutsche Telekom's first Hadoop 
implementation. Prior to joining Cloudera, Justin worked as a Hadoop 
Architect/Developer for DataSift. DataSift had Europe's largest HBase 
cluster with 1PB of storage. Justin now works in Cloudera's Customer 
Operations team as an Operations Engineer supporting Cloudera's cus¬ 
tomers across the planet. 

Justin has spoken at CRM Conferences in Asia, delivered customer 
training in that region, and provided mentoring to new and junior staff. 
Justin is from the U.K. and is married with one daughter. He is a very 
keen cyclist and can be found going very fast downhill on plastic bikes 
either on or off road. 


Dave Jespersen 

Dave brings his deep engineering experience to his role of 
chief customer advocate at MapR Technologies. He en¬ 
riches the customer experience by working with MapR's 
customer base to develop and implement innovative solu¬ 
tions to the complex problems faced by every enterprise. 

He was previously VP of Engineering at MapR, where he led the de¬ 
velopment of MapR's industry-leading products. Dave has 30 years of 
successful enterprise software development experience in both small 
and large companies, including EMC, Sun Microsystems, Sterling Soft¬ 
ware, Spectra Logic, Exabyte and DEC. Dave was educated at Brigham 
Young University, where he earned a BS M.E. and a minor in Computer 
Science. 




Jock Mackinlay 

Jock is Tableau Software's Senior Director of Visual Analy¬ 
sis. At Stanford University, he pioneered the automatic de¬ 
sign of graphical presentations of relational information. 
He joined Xerox PARC in 1986, where he collaborated with 
the User Interface Research Group to develop many novel applications 
of computer graphics for information access, coining the term 
“Information Visualization.” Many of the fruits of this research can be seen in 
his book, “Readings in Information Visualization: Using Vision to 
Think." Jock has a Ph.D. in computer science from Stanford University. 


Michael Miller 

Mike is Chief Scientist at Cloudant, where he develops and 
evangelizes the company's technical vision and manages 
long-term product R&D. While at MIT as a Postdoctoral 
Fellow, he cofounded Cloudant after cutting his teeth on 
petabyte-per-second problems at the Large Hadron Collider. Mike 
holds a B.S. in Physics and a B.A. in Philosophy from Michigan State 
University, a Ph.D. in Physics from Yale University, and is an Affiliate 
Professor of Particle Physics at the University of Washington. He has 
more than a decade's worth of experience as a builder of the most ex¬ 
treme Big Data systems on earth, as well as extensive experience lectur¬ 
ing on mathematics, physics, data science, and philosophy at the 
graduate and undergraduate level. 






Deep Mistry 

Deep is a consultant at Open Software Integrators, a U.S. 
firm specializing in NoSQL/Big Data development with of¬ 
fices in Chicago and Durham, N.C. Deep has been pro¬ 
gramming for more than eight years and has worked on 
multiple software engineering projects, from developing Big Data train¬ 
ing materials to implementing large data systems requiring Internet- 
speed response times. He has been heavily involved with MongoDB, 
Neo4j, Couchbase and Hadoop since their births onto the Big Data 
scene. You can find his white papers and Big Data blogs on the Open 
Software Integrators website. Deep received his Master's in Computer 
Science from North Carolina State University. 

Jacques Nadeau 

Jacques leads Apache Drill development efforts at MapR 
Technologies. He is an industry veteran with more than 15 
years of Big Data and analytics experience. Most recently, 
he was cofounder and CTO of search engine startup 
YapMap. Before that, he was director of new product engineering with 
Quigo (contextual advertising, acquired by AOL in 2007). He also built 
the Avenue A Razorfish analytics data-warehousing system and associ¬ 
ated services practice (acquired by Microsoft). 

Paco Nathan 

Paco is the Director of Data Science at Concurrent in San 
Francisco and a committer on the Cascading open-source 
project. He has expertise in Hadoop, R, Amazon Web Serv¬ 
ices, machine learning, predictive analytics, and more than 
25 years in the tech industry overall. For more than 10 years, Paco has 
led innovative data science teams, building large-scale apps. He is also 
the author of “Enterprise Data Workflows with Cascading.” Previously a 
Computer Science instructor at Stanford University, he is now teaching 
professional workshops about data science, Big Data, machine learn¬ 
ing, and more. 





Krishnan Raman 

Krishnan is a data scientist at Twitter. He was formerly a 
risk quant at Bank of America, an associate at Goldman 
Sachs, and an engineer at Sun Microsystems. His experience 
in building the real-time proprietary trading system 
WebET at Goldman Sachs, and concurrent Scala systems to compute 
the conditional value at risk of large credit portfolios at BAC, have put 
him in good stead at the Revenue Quality team at Twitter. His primary 
tools are Scala, Scalding and a dash of statistics and math. He has grad¬ 
uate degrees in math, computer science and mathematical finance 
from the University of Chicago. 

Jairam Ranganathan 

Jairam is the Director of Product Strategy at Cloudera, 
where he is responsible for planning the road map of 
Cloudera products. Before Cloudera, he spent a decade at 
VMware, where among other things he was one of the de¬ 
velopers on vMotion, storage vMotion, and the distributed manage¬ 
ment framework for vSphere. 

Dan Rosanova 

Dan is a four-time Microsoft Integration MVP with 14 years 
of experience delivering solutions on Microsoft and Solaris 
platforms in the financial services, insurance, banking, 
telecommunications, and logistics industries. He has spe¬ 
cialized in high-volume and low-latency distributed applications. His 
recent focus has been on Hadoop, evolutionary computation and GPU 
computing. Dan speaks frequently on leading-edge technology and its 
impact on the enterprise landscape. Dan is the author of “Microsoft 
BizTalk Server 2010 Patterns.” Dan is a senior architect in the Technology 
Integration practice at West Monroe Partners, an international, full- 
service business and technology-consulting firm focused on guiding 
organizations through projects that fundamentally transform their 
businesses. 



Josh Patterson 

Josh is a Principal Solution Architect at Cloudera. Prior to 
joining Cloudera, he was responsible for bringing Hadoop 
into the smart grid during his involvement in the openPDC 
project. His focus in the smart grid realm was on using Hadoop and 
machine learning to discover and index anomalies in 
time-series data. Josh spent three years as a Principal Solutions Architect 
with Cloudera helping Fortune 100 companies build out their 
Hadoop and machine-learning pipelines. 

Josh is a graduate of the University of Tennessee at Chattanooga 
with a Bachelor’s in Business Management and a Master’s of Computer 
Science with a thesis titled “TinyTermite: A Secure Routing Algorithm,” 
where he worked in mesh networks and social insect swarm algo¬ 
rithms. Josh has spent more than 15 years in software development, 
and he continues to contribute to projects such as Apache Mahout, 
Metronome, IterativeReduce, openPDC, and JMotif in the open-source 
community. 





Eddie Satterly 

Eddie is Chief Big Data Evangelist at Splunk, and has 
served in a variety of roles, including developer, engineer, 
architect and CTO over his 23-year career. He has been a 
longtime Big Data user, even before it was the cool thing to 
do. More recently, he revolutionized the way a leading online 
travel agency delivers its core Web applications, resulting in an improved 
user experience. He created a highly scalable and flexible Big 
Data environment using best-in-breed tools, and as a result, was able 
to retire 35 other systems. 

Eddie has done guest lectures at universities, and presents at several 
conferences and symposiums yearly. He is a recognized expert in the 
field of Big Data and has presented at many global conferences on the 
topic. Eddie has a B.S. in Computer Science from Indiana University. 
Michael Segel 

Michael is a principal consultant with Think Big Analytics. 
As a principal, he is involved in working with clients, assist¬ 
ing with their strategy and implementation of Hadoop. 
Michael is also involved as an instructor with Think Big's 
Academy, teaching courses on Hadoop Development in Java, Hive and 
Pig, along with HBase. 

Prior to joining Think Big, Michael ran his own consulting firm, de¬ 
veloping solutions for customers around the Chicago area. Since 2009, 
Michael has been working primarily in the Big Data Space. He also 
founded the Chicago Hadoop User Group (CHUG). 

Michael received his bachelor's degree in Computer Science from 
the College of Engineering at Ohio State University. 

Jonathan Seidman 

Jonathan is a Solutions Architect on the Partner Engineer¬ 
ing team at Cloudera. Before joining Cloudera, he was a 
Lead Engineer on the Big Data team at Orbitz Worldwide, 
helping to build out the Hadoop clusters supporting the 
data-storage and analysis needs of one of the most heavily trafficked 
sites on the Internet. Jonathan is also a cofounder and organizer of the 
Chicago Hadoop User Group and the Chicago Big Data Meetup, and a 
frequent speaker on Hadoop and Big Data at industry conferences such 
as Hadoop World, Strata and OSCON. 





Tony Shan 

Tony is a renowned thought leader and technology vision¬ 
ary with decades of experience and guru-level knowledge 
on emerging technologies for pragmatic enterprise com¬ 
puting. He has directed and led the life-cycle design of 
complex distributed systems on diverse platforms in Fortune 50 com¬ 
panies and big public-sector organizations. 

He drove innovations with insightful consulting and advising on 
large-scale high-profile projects that won many awards. He authored 
dozens of top-notch publications and more than 10 books on next-generation 
technologies. He contributed multiple entries on architecture and 
methodology to IT encyclopedias. He is a regular keynote speaker and 
chair, moderator, advisor, and organizing committee member in pre¬ 
eminent conferences; an editor and editorial advisory board member 
of IT research journals and books; and a founder of several user groups 
and forums. In particular, he is a world-leading authority in the Big 
Data and cloud space, delivering scores of presentations, panels and 
workshops at various industry events, and serving as general chair at 
international conferences. He has extensive speaking experience at 
conferences and industry events. 


Kathleen Ting 

Kathleen is a Support Manager at Cloudera, is a committer 
on the Apache Sqoop project, and has spoken at many Big 
Data conferences, such as Hadoop World on Map/Reduce; 
at HBaseCon on HBase; at Strange Loop on ZooKeeper; 
and at Hadoop Summit on Sqoop. 




Siva Vaidyanatha 

Siva is the Chief Technology Officer for the Retail, Consumer 
Goods, Life Sciences and Logistics Business Unit at 
Infosys. He is a member of the unit Executive Council and 
is also responsible for the setup, organization and delivery 
of technology consulting services aligned with the unit's strategic plans. 

Siva has about 16 years of industry experience, and has spent the 
last 10 years with Infosys in various technology leadership roles. He is 
recognized as a technology visionary and has incubated several innova¬ 
tive technology products and solutions. He has also authored two 
books on next-generation architecture and Big Data. 

Siva is on the Board of Directors of the Parkland Center for Clinical 
Innovation. PCCI's vision is to help transform the delivery of healthcare 
by developing cutting-edge software and analytic methods to improve 
the quality and safety of care at the individual and population levels. 

Siva received his bachelor's degree in Engineering from the Indian 
Institute of Technology, Madras, and his master's degree in Business 
Administration from the SMU-Cox School of Business. 


Vinithra Varadharajan 

Vinithra is a Software Engineer at Cloudera. She builds 
tools for Hadoop life-cycle management, with a focus on 
automatic configuration of Hadoop clusters, and setting up 
High Availability and Disaster Recovery systems. 


Anton Yazovskiy 

Anton is a Software Engineer at Thumbtack Technology, 
where he focuses on high-performance enterprise archi¬ 
tecture. He has presented at a variety of IT conferences and 
“Dev Days" on topics such as NoSQL and MarkLogic. 

Anton has been an active user of many NoSQL databases, including 
Cassandra, MongoDB, MarkLogic, Aerospike and HBase. Like many 
people, he learned some of the difficulties behind polyglot persistence 
the hard way, and is hoping his talk will help others avoid making some 
of the same mistakes he made. 



Dave Stokes 

David is a MySQL Community Manager for Oracle, and 
previously was the Certification Manager for MySQL AB. 
He has worked for companies ranging from the American 
Heart Association to Xerox. 








October 15-17, 2013 • San Francisco • www.BigDataTechCon.com


Hotel & Travel

Big Data TechCon will be held at the Hyatt Regency, just outside of San Francisco.


Hyatt Regency Burlingame 

1333 Bayshore Highway 
Burlingame, CA, 94010 
Phone: +1-650-347-1234 
Fax: +1-650-696-2669 

www.sanfranciscoairport.hyatt.com 

Special Big Data TechCon 
Discounted Rates 

Take advantage of special discounted room rates at the
Hyatt Regency: only US$185 per night for single/double
occupancy.

Rooms for the reduced rate are limited! 

Click here to make your hotel reservation 
or use the “Make Hotel Reservation” link 
on the confirmation page of your registration. 

Reservations at the reduced rate can be made through 5:00 pm 
Eastern time on October 5, 2013 — assuming they don’t sell out. 
The number of rooms in the discounted block is LIMITED and 
historically rooms sell out well before the deadline. Don’t wait 
until the last minute to reserve your hotel rooms! 

This rate is available throughout Big Data TechCon. Those who 
reserve their hotel rooms via this reservation link will receive: 

• Complimentary wireless Internet service in their rooms. 

• Overnight self-parking discounted to $8 per day. 

The Hyatt does offer valet parking options at $25 per day. 

Hotel Highlights 

The Hyatt Regency Burlingame is a newly updated hotel located
on San Francisco Bay between the excitement of downtown
San Francisco and the technology corridor of Silicon Valley. 



Parking 

The Hyatt parking garage makes for easy arrivals and fast departures
via self parking. A full day of parking is $8. The Hyatt does
offer valet parking options at $25/day.

Complimentary Shuttle Service 

The shuttle is available every day, 24 hours a day, and runs every
10-15 minutes. Take your luggage to the Departures Level, center 
island, and look for the area marked “Hotel Shuttle.” The shuttle is 
a large bus marked “Hyatt Regency and Marriott.” For arrivals 
from Midnight-4:46 am, shuttles pick up every 30 minutes. 

Driving Directions 

From San Francisco International Airport (2 miles): 

Take 101 South toward San Jose. Exit Millbrae Ave. Turn left on
Millbrae Ave. Turn right at the second stoplight onto Bayshore 
Hwy. Proceed through 4 stoplights. Hotel is on right hand side. 

From Oakland Airport (approximately 30 miles) and Points East: 

Take I-880 South toward San Jose. Merge onto CA-92 W toward
San Mateo Br. Merge onto US-101 N toward San Francisco to the 
Broadway Exit. Take the Airport Blvd ramp toward Bayshore Blvd, 
then turn left onto Bayshore Hwy to the hotel. 

From San Jose Airport (approximately 30 miles) and Points South:

Take 101 North to the Broadway Exit. Take the Airport Blvd ramp 
toward Bayshore Blvd, then turn left onto Bayshore Hwy to the 
hotel.

1. STUDY. Note the HOW-TO classes and tutorials at
Big Data TechCon focused on the latest Big Data technologies,
especially those that are best aligned with your company’s existing
IT infrastructure. Say that this is your first, and most practical,
opportunity to bring Big Data to your business.

2. PREPARE. Download the course catalog and 
circle the classes you want to take, and explain why 
the topics relate to your Big Data technical efforts. 

Show that you have found many sessions that 
fit your specific needs, and your company’s 
strategic goals. 

3. JUSTIFY. Go in armed with all the necessary
materials to make a good case for how your
attending Big Data TechCon will help your company
make money, save money or improve productivity by helping
you capture and analyze the data that drives your business.

4. SHARE. Promise to come back from Big Data TechCon and
hold a brown-bag lunch session to share what you've learned 
with your colleagues, or even conduct formal training within your 
department. In fact, maybe you’ll want to schedule a series of 
brown-bag lunches. 

5. PLAN. Tell management that after you attend
Big Data TechCon, you'll make definite action plans and
recommendations to implement new Big Data plans, and to 
improve how your company uses all of the data it captures. 


6. RELATE. Show how problems or issues you’ve recently
encountered fit with the classes at Big Data TechCon, and
discuss the types of technology discussions you'll have with
the conference faculty and other IT professionals.

7. SAVE. The tuition and travel expense of attending Big Data
TechCon is less than many other conferences. The earlier you sign
up, the more you save, so explain the benefit of signing up early
and of booking your hotel room before the cutoff.

8. TEAM. Save even more with group discounts. Send three or
more employees from your company, and save $100 per
person. Each person can take different classes and bring back
even more valuable tips and techniques. (Sending
10 or more? Contact us for arrangements.)

9. GROUP. User groups, government employees,
non-profits and professionals employed by or attending
educational institutions can also receive special savings.
Check the website or ask Stacy Burris (sburris@bzmedia.com)
about custom options for your group.

10. LAUNCH. Classes at Big Data TechCon help you get a
jump-start on every aspect of Big Data that you have been talking
about implementing (but haven’t) for months. Whether it’s
Hadoop, graph databases, NoSQL or another new technology,
explain that you’ll find the answers here.

11. DECIDE. While you can sign up anytime, your company
will save the most if you beat the deadlines. Explain that you
will help your company’s bottom line by signing up for
Big Data TechCon today!


Big Data gets REAL at Big Data TechCon


“Big Data TechCon is a great way to raise your awareness on
what’s out there for Big Data and gives you ideas on what to
dig into.”
Corey Andalora, Sr. Java Developer, Dealer.com
Conference Pricing

                         Register by   Register by   Register by   After
                         Aug. 2        Aug. 30       Sept. 27      Sept. 27

Three-Day Conference     $1,195        $1,295        $1,395        $1,595
October 15-17            SAVE $400     SAVE $300     SAVE $200

Exhibit Hall Only        FREE          FREE          FREE          FREE
October 16-17





Register Online TODAY at www.BigDataTechCon.com! 

Three-Day Conference
Registration Includes:

• Admission to tutorials and
technical classes on October 15, 16 and 17

• Admission to keynotes

• Admission to the Exhibit Hall

• Admission to all special events,
including the Networking Reception

• Downloadable conference materials

• Coffee breaks and lunch where indicated

Exhibit Hall Only
Registration Includes:

• Admission to the Exhibit Hall

• Admission to Networking Reception


How to Register

Register online and use one of the
following payment methods:

Credit Card. You can use the secure
online form to pay via credit card and
get immediate confirmation of your registration.
MasterCard, Visa and American
Express are accepted. You'll receive a
registration record and receipt. Please
print out these pages and bring them with
you to the Conference. Present them at the
Registration Desk to pick up your badge
and course materials.

Check. Fill out the online registration
form. Print out the registration record and
receipt and mail them to BZ Media LLC, 225
Broadhollow Road, Suite 211, Melville, NY
11747, with your payment. Online registrations
that are mailed without payment will
not be confirmed until payment is received.

Purchase Order. If you register using
a P.O., you'll be invoiced immediately for
the registration amount. Payment must
be received before your registration can be
confirmed.


Cancellation and Refund Policy

You can receive a full refund, less 
a $150 registration fee, for cancellations
made by Friday, Aug. 30, 2013. Cancellations
after this date are non-refundable.
Send your cancellation in writing to 
registration@bzmedia.com. Registrations 
may be transferred to another person. 

Refunds will be processed through the 
same method of payment as the initial 
payment transaction. Credit-card refunds 
will be processed to the same credit card 
as the original payment. 

If for reasons beyond our control 
the conference cannot take place as 
scheduled, BZ Media reserves the right to 
reschedule the conference to a date and 
place of its choosing.

Questions 

Contact Stacy Burris, Event Director, at 
sburris@bzmedia.com or 
+1-631-421-4158 x108.


Special Discounts 

You may combine one of these special discounts with the Early Registration pricing to save even more! 


Group. Group discounts will be given automatically if you 
register three or more people at once. You can also contact 
Stacy Burris at sburris@bzmedia.com to receive the $100/person 
discount if your group is unable to register at the same time. 
Contact her also for special discounts for groups of 10 or more. 

Government Employees. Federal, State and Local Government
employees can receive an additional $100 off the
Three-Day Conference price. Enter code GOV in the
discount code field. CCR-registered indicates that we are listed in
the primary supplier database for the Federal Government.


Educational Institutions. Personnel employed by or attending 
educational institutions can get a $100 discount off the 
Three-Day Conference price by using the code EDU. 

User Groups. Contact Stacy Burris at sburris@bzmedia.com to 
see if your group is eligible for a discount. 

Non-Profit Organizations. Personnel employed by non-profit 
organizations can get a $100 discount off the Three-Day 
Conference price by using the code NONPROFIT. 










